* topic/robin/ascii-escape-normalization:
Updating NEWS.
In bifs, change ODesc objects to have RAW_STYLE.
Changing what's escaped when printing.
Remove several BroString escaping methods that are no longer useful.
BIT-1333 #merged
one can now add an option "offset" to the config map. Positive offsets
are interpreted to be from the beginning of the file, negative from the
end of the file (-1 is end of file).
Only works for raw reader in streaming or manual mode. Does not work
with executables.
Addresses BIT-985
With this patch the model is:
- "print" cleans the data so that non-printable characters get
escaped. This is not necessarily reversible.
- to print in a reversible way, one can go through
escape_string(); this escapes backslashes as well to make the
decoding non-ambigious.
- Logging always escapes similar to escape_string(), making it
reversible.
Compared to master, we also change the escaping as follows:
- We now only escape with "\xXX", no more "^X" or "\0". Exception:
backslashes.
- We escape backlashes as "\\".
- There's no "alternative" output style anymore, i.e., fmt() '%A'
qualifier is gone.
Baselines in testing/btest are updated, external tests not yet.
Addresses BIT-1333.
I replaced a few strcmps with either calls to std::str.compare
or with the == operator of BroString.
Also changed two of the input framework tests that did not pass
anymore after the merge. The new SSH analyzer no longer loads the
scripts that let network time run, hence those tests failed because
updates were not propagated from the threads (that took a while
to find.)
* origin/topic/vladg/ssh: (25 commits)
SSH: Register analyzer for 22/tcp.
SSH: Add 22/tcp to likely_server_ports
SSH: Ignore encrypted packets by default.
SSH: Fix some edge-cases which created BinPAC exceptions
SSH: Add memleak btest
SSH: Update baselines
SSH: Added some more events for SSH2
SSH: Intel framework integration (PUBKEY_HASH)
Update baselines for new SSH analyzer.
Update SSH policy scripts with new events.
SSH: Add documentation
Refactoring ssh-protocol.pac:
SSH: Use the compression_algorithms const in another place.
Some cleanup and refactoring on SSH main.bro.
SSH: A bit of code cleanup.
Move SSH constants to consts.pac
SSH: Cleanup code style.
SSH: Fix some memleaks.
Refactored the SSH analyzer. Added supported for algorithm detection and more key exchange message types.
Add host key support for SSH1.
Add support for SSH1
Move SSH analyzer to new plugin architecture.
...
Conflicts:
scripts/base/protocols/ssh/main.bro
testing/btest/Baseline/core.print-bpf-filters/output2
testing/btest/Baseline/plugins.hooks/output
BIT-1344: #merged
into BroVals (especially enums)
Not we do not force an internal error anymore. Instead, we raise an
normal error and set an error flag that signals to the top-level
functions that the value could not be converted and should not be
propagated to the Bro core. This sadly makes the already messy code even
more messy - but since errors can happen in deeply nested data
structures, the alternative (catching the error at every possible
location and then trying to clean up there instead of recursively
deleting the data that cannot be used later) is much worse.
Addresses BIT-1199
* origin/fastpath:
Crashing bug in WriterBackend when deserializing WriterInfo where config is present. Testcase crashes on unpatched versions of Bro.
Fix wrong value test in WriterBackend. Found by Aaron Eppert (aeppert@gmail.com)
is present. Testcase crashes on unpatched versions of Bro.
Found by Aaron Eppert <aeppert@gmail.com>.
This (probably) fixes the crash issue with sqlite a few people have
reported on the mailing list in the past.
- Any files where the total size was below the size of the
default bof_buffer size couldn't have stream analyzers successfully
attached because the bof_buffer never reached the full size
and was never flushed. This branch explicitly marks the buf_buffer
as full and flushes it when the file is being removed.
with a MIME type.
Whenever that MIME is detected, Bro will now automatically activate
the analyzer. The interface mimics how well-known ports are defined
for protocol analyzers.
This isn't actually used by any existing file analyzer (because we
don't have any yet that target a specific file format), but there's a
test making sure it works.
This prevents the worker nodes from crashing, when request_key is used
in cluster mode and called on the worker and the manager nodes (i.e. when
a non-cluster-aware script is used).
Addresses BIT-1177
- It's not *exactly* ISO 8601 which doesn't seem to support
subseconds, but subseconds are very important to us and
most things that support ISO8601 seem to also support subseconds
in the way I'm implemented it.
- Formatters have been abstracted similarly to readers and writers now.
- The Ascii writer has a new option for writing out logs as JSON.
- The Ascii writer now has all options availble as per-filter
options as well as global.
- First:
Due to architectural constraints, it is very hard for the
input framework to handle optional records. For an optional record,
either the whole record has to be missing, or all non-optional elements
of the record have to be defined. This information is not available
to input readers after the records have been unrolled into the threading
types.
Behavior so far was to treat optional records like they are non-optional,
without warning. The patch changes this behavior to emit an error on stream-
creation (during type-checking) and refusing to open the file. I think this
is a better idea - the behavior so far was undocumented and unintuitive.
- Second:
For table and event streams, reader backend creation was done very early,
before actually checking if all arguments are valid. Initialization is moved
after the checks now - this makes a number of delete statements unnecessary.
Also - I suspect threads of failed input reader instances were not deleted
until shutdown
- Third:
Add a couple more consistency checks, e.g. checking if the destination value
of a table has the same type as we need. We did not check everything in all
instances, instead we just assigned the things without caring (which works,
but is not really desirable).
This change also exposed a few bugs in other testcases where table definitions
were wrong (did not respect $want_record)
- Fourth:
Improve error messages and write testcases for all error messages (I think).
Sorry for this - I noticed that I named this option quite unfortunately
while writing the documentation.
The patch also removes the dbname configuration option from the sqlite
input reader - it was not used there at all anymore (and I did not notice
that).
* origin/topic/bernhard/hyperloglog: (32 commits)
add clustered leak test for hll. No issues.
make gcc happy
(hopefully) fix refcounting problem in hll/bloom-filter opaque vals. Thanks Robin.
re-use same hash class for all add operations
get hll ready for merging
and forgot a file...
adapt to new structure
fix opaqueval-related memleak.
make it compile on case-sensitive file systems and fix warnings
make error rate configureable
add persistence test not using predetermined random seeds.
update cluster test to also use hll
persistence really works.
well, with this commit synchronizing the data structure should work.. ...if we had consistent hashing.
and also serialize the other things we need
ok, this bug was hard to find.
serialization compiles.
change plugin after feedback of seth
Forgot a file. Again. Like always. Basically.
do away with old file.
...
- Generally increased the time allowed before they timeout.
- For tests w/ a clear termination condition (most of them), made
timeouts result in a test failure.
- Seemed to be a race in some cases between tests generating output and
the input reader stream getting removed/closed, so moved stream removal
closer to termination time, when all output should be available.
- Primarily working around an issue that occurs when threads
concurrently create pipes and fork a child process. See comment in
code...
- Other minor cleanup of the code: making sure the child process calls
_exit() versus exit(), limits itself to few select system calls before
the exec(), and closes more unused file descriptors.
* origin/topic/seth/sumstats-updates:
Still fixing bugs in sumstats updated api cluster support.
Hopefully fix the SumStats cluster support.
Fix the SumStats top-k plugin and test.
Updates for SumStats API to deal with high memory stats.
Beginning rework of SumStats API.
Tiny fix to account for missing str field (not sure how this happens yet)
Add server samples to SSH bruteforce detection.
Fix a reporter message in sumstats.
SumStats changes to how thresholding works to simplify and reduce memory use.
More adjustments to try and correct SumStats memory use.
Hopefully fixing a strange error.
Large update for the SumStats framework.
- Do stream mode for commands done by exec module, it seems important
in some cases (e.g. ensure requested stdin is fully written).
- For cases where the raw input reader knows the child process has been
reaped, set the childpid member to a sentinel value to indicate such
so we don't later think we should kill it or wait on it anymore.
- More error checking on dup2/close calls. Set sentinel values when
closing ends of pipes to prevent double closing a fd.
- Signal flag not set when raw input reader's child exits as a result
of a signal. Left out a test for this -- might be portability issues
(e.g. Ubuntu seems to do things different regarding the exit code and
also is printing "Killed" to stderr where other platforms don't).
* origin/topic/seth/faf-updates: (27 commits)
Undoing the FTP tests I updated earlier.
Update the last two btest FAF tests.
File analysis fixes and test updates.
Fix a bug with getting analyzer tags.
A few test updates.
Some tests work now (at least they all don't fail anymore!)
Forgot a file.
Added protocol description functions that provide a super compressed log representation.
Fix a bug where orig file information in http wasn't working right.
Added mime types to http.log
Clean up queued but unused file_over_new_connections event args.
Add jar files to the default MHR lookups.
Adding CAB files for MHR checking.
Improve malware hash registry script.
Fix a small issue with finding smtp entities.
Added support for files to the notice framework.
Make the custom libmagic database a git submodule.
Add an is_orig parameter to file_over_new_connection event.
Make magic for emitting application/msword mime type less strict.
Disable more libmagic builtin checks that override the magic database.
...
Conflicts:
doc/scripts/DocSourcesList.cmake
scripts/base/init-bare.bro
scripts/test-all-policy.bro
testing/btest/Baseline/coverage.bare-load-baseline/canonified_loaded_scripts.log
testing/btest/Baseline/coverage.default-load-baseline/canonified_loaded_scripts.log
- Several places were just using old variable names or not loading
scripts correctly after they'd been renamed/moved.
- Revert/adjust a change in how HTTP file handles are generated that
broke partial content responses.
- Turn some libmagic builtin checks back on; seems some are actually
useful (e.g. text detection seems to be a builtin). The rule going
forward probably will be only to turn off a builtin if we confirm it
causes issues.
- Removed some tests that are redundant or not necessary anymore because
the generic file analysis tests cover them.
- A couple FTP tests still fail that I think need an actual solution via
script changes.