with a MIME type.
Whenever that MIME is detected, Bro will now automatically activate
the analyzer. The interface mimics how well-known ports are defined
for protocol analyzers.
This isn't actually used by any existing file analyzer (because we
don't have any yet that target a specific file format), but there's a
test making sure it works.
This prevents the worker nodes from crashing, when request_key is used
in cluster mode and called on the worker and the manager nodes (i.e. when
a non-cluster-aware script is used).
Addresses BIT-1177
- It's not *exactly* ISO 8601 which doesn't seem to support
subseconds, but subseconds are very important to us and
most things that support ISO8601 seem to also support subseconds
in the way I'm implemented it.
- Formatters have been abstracted similarly to readers and writers now.
- The Ascii writer has a new option for writing out logs as JSON.
- The Ascii writer now has all options availble as per-filter
options as well as global.
- First:
Due to architectural constraints, it is very hard for the
input framework to handle optional records. For an optional record,
either the whole record has to be missing, or all non-optional elements
of the record have to be defined. This information is not available
to input readers after the records have been unrolled into the threading
types.
Behavior so far was to treat optional records like they are non-optional,
without warning. The patch changes this behavior to emit an error on stream-
creation (during type-checking) and refusing to open the file. I think this
is a better idea - the behavior so far was undocumented and unintuitive.
- Second:
For table and event streams, reader backend creation was done very early,
before actually checking if all arguments are valid. Initialization is moved
after the checks now - this makes a number of delete statements unnecessary.
Also - I suspect threads of failed input reader instances were not deleted
until shutdown
- Third:
Add a couple more consistency checks, e.g. checking if the destination value
of a table has the same type as we need. We did not check everything in all
instances, instead we just assigned the things without caring (which works,
but is not really desirable).
This change also exposed a few bugs in other testcases where table definitions
were wrong (did not respect $want_record)
- Fourth:
Improve error messages and write testcases for all error messages (I think).
Sorry for this - I noticed that I named this option quite unfortunately
while writing the documentation.
The patch also removes the dbname configuration option from the sqlite
input reader - it was not used there at all anymore (and I did not notice
that).
* origin/topic/bernhard/hyperloglog: (32 commits)
add clustered leak test for hll. No issues.
make gcc happy
(hopefully) fix refcounting problem in hll/bloom-filter opaque vals. Thanks Robin.
re-use same hash class for all add operations
get hll ready for merging
and forgot a file...
adapt to new structure
fix opaqueval-related memleak.
make it compile on case-sensitive file systems and fix warnings
make error rate configureable
add persistence test not using predetermined random seeds.
update cluster test to also use hll
persistence really works.
well, with this commit synchronizing the data structure should work.. ...if we had consistent hashing.
and also serialize the other things we need
ok, this bug was hard to find.
serialization compiles.
change plugin after feedback of seth
Forgot a file. Again. Like always. Basically.
do away with old file.
...
- Generally increased the time allowed before they timeout.
- For tests w/ a clear termination condition (most of them), made
timeouts result in a test failure.
- Seemed to be a race in some cases between tests generating output and
the input reader stream getting removed/closed, so moved stream removal
closer to termination time, when all output should be available.
- Primarily working around an issue that occurs when threads
concurrently create pipes and fork a child process. See comment in
code...
- Other minor cleanup of the code: making sure the child process calls
_exit() versus exit(), limits itself to few select system calls before
the exec(), and closes more unused file descriptors.
* origin/topic/seth/sumstats-updates:
Still fixing bugs in sumstats updated api cluster support.
Hopefully fix the SumStats cluster support.
Fix the SumStats top-k plugin and test.
Updates for SumStats API to deal with high memory stats.
Beginning rework of SumStats API.
Tiny fix to account for missing str field (not sure how this happens yet)
Add server samples to SSH bruteforce detection.
Fix a reporter message in sumstats.
SumStats changes to how thresholding works to simplify and reduce memory use.
More adjustments to try and correct SumStats memory use.
Hopefully fixing a strange error.
Large update for the SumStats framework.
- Do stream mode for commands done by exec module, it seems important
in some cases (e.g. ensure requested stdin is fully written).
- For cases where the raw input reader knows the child process has been
reaped, set the childpid member to a sentinel value to indicate such
so we don't later think we should kill it or wait on it anymore.
- More error checking on dup2/close calls. Set sentinel values when
closing ends of pipes to prevent double closing a fd.
- Signal flag not set when raw input reader's child exits as a result
of a signal. Left out a test for this -- might be portability issues
(e.g. Ubuntu seems to do things different regarding the exit code and
also is printing "Killed" to stderr where other platforms don't).
* origin/topic/seth/faf-updates: (27 commits)
Undoing the FTP tests I updated earlier.
Update the last two btest FAF tests.
File analysis fixes and test updates.
Fix a bug with getting analyzer tags.
A few test updates.
Some tests work now (at least they all don't fail anymore!)
Forgot a file.
Added protocol description functions that provide a super compressed log representation.
Fix a bug where orig file information in http wasn't working right.
Added mime types to http.log
Clean up queued but unused file_over_new_connections event args.
Add jar files to the default MHR lookups.
Adding CAB files for MHR checking.
Improve malware hash registry script.
Fix a small issue with finding smtp entities.
Added support for files to the notice framework.
Make the custom libmagic database a git submodule.
Add an is_orig parameter to file_over_new_connection event.
Make magic for emitting application/msword mime type less strict.
Disable more libmagic builtin checks that override the magic database.
...
Conflicts:
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/test-all-policy.bro
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
- Several places were just using old variable names or not loading
scripts correctly after they'd been renamed/moved.
- Revert/adjust a change in how HTTP file handles are generated that
broke partial content responses.
- Turn some libmagic builtin checks back on; seems some are actually
useful (e.g. text detection seems to be a builtin). The rule going
forward probably will be only to turn off a builtin if we confirm it
causes issues.
- Removed some tests that are redundant or not necessary anymore because
the generic file analysis tests cover them.
- A couple FTP tests still fail that I think need an actual solution via
script changes.
- Intel importing format has changed (refer to docs).
- All string matching is now case insensitive.
- SMTP intel script has been updated to extract email
addresses correctly.
- Small fix sneaking into the smtp base script to actually
extract individual email addresses in the To: field
correctly.
Closes#1021.
* origin/topic/bernhard/input-update:
this event handler fails the unused-event-handlers test because it is a bit of a special case.
...and fix the event ordering issue. Dispatch != QueueEvent
add Terminate to input framework to prevent potential shutdown race-conditions.
fix warning.
fix stderr test. ls behaves differently on errors on linux...
small fixes.
linux does not have strnstr
and close only fds that are currently open (the logging framework really did not like that :) )
A bunch of more changes for the raw reader
make reading from stdout and stderr simultaneously work.
allow sending data to stdin of child process
Streaming reads from external commands work without blocking anything.
replace popen with fork and exec.
change raw reader to use basic c io instead of fdstream encapsulation class.