Calling Error() in an input reader now automatically will disable the
reader and return a failure in the Update/Heartbeat calls.
Also adds more tests.
Addresses BIT-1181
This change introduces error events for Table and Event readers. Users
can now specify an event that is called when an info, warning, or error
is emitted by their input reader. This can, e.g., be used to raise
notices in case errors occur when reading an important input stream.
Example:
event error_event(desc: Input::TableDescription, msg: string, level: Reporter::Level)
{
...
}
event bro_init()
{
Input::add_table([$source="a", $error_ev=error_event, ...]);
}
For the moment, this converts all errors in the Asciiformatter into
warnings (to show that they are non-fatal) - the Reader itself also has
to throw an Error to show that a fatal error occurred and processing
will be abort.
It might be nicer to change this and require readers to mark fatal
errors as such when throwing them.
Addresses BIT-1181
The ascii reader now accepts \r\n newlines without complaining.
Furthermore, the reader was slightly rewritten in a more c++11-y way,
removing all raw pointers from the class.
Addresses BIT-1198
one can now add an option "offset" to the config map. Positive offsets
are interpreted to be from the beginning of the file, negative from the
end of the file (-1 is end of file).
Only works for raw reader in streaming or manual mode. Does not work
with executables.
Addresses BIT-985
I replaced a few strcmps with either calls to std::str.compare
or with the == operator of BroString.
Also changed two of the input framework tests that did not pass
anymore after the merge. The new SSH analyzer no longer loads the
scripts that let network time run, hence those tests failed because
updates were not propagated from the threads (that took a while
to find.)
* origin/topic/vladg/ssh: (25 commits)
SSH: Register analyzer for 22/tcp.
SSH: Add 22/tcp to likely_server_ports
SSH: Ignore encrypted packets by default.
SSH: Fix some edge-cases which created BinPAC exceptions
SSH: Add memleak btest
SSH: Update baselines
SSH: Added some more events for SSH2
SSH: Intel framework integration (PUBKEY_HASH)
Update baselines for new SSH analyzer.
Update SSH policy scripts with new events.
SSH: Add documentation
Refactoring ssh-protocol.pac:
SSH: Use the compression_algorithms const in another place.
Some cleanup and refactoring on SSH main.bro.
SSH: A bit of code cleanup.
Move SSH constants to consts.pac
SSH: Cleanup code style.
SSH: Fix some memleaks.
Refactored the SSH analyzer. Added supported for algorithm detection and more key exchange message types.
Add host key support for SSH1.
Add support for SSH1
Move SSH analyzer to new plugin architecture.
...
Conflicts:
scripts/base/protocols/ssh/main.bro
testing/btest/Baseline/core.print-bpf-filters/output2
testing/btest/Baseline/plugins.hooks/output
BIT-1344: #merged
into BroVals (especially enums)
Not we do not force an internal error anymore. Instead, we raise an
normal error and set an error flag that signals to the top-level
functions that the value could not be converted and should not be
propagated to the Bro core. This sadly makes the already messy code even
more messy - but since errors can happen in deeply nested data
structures, the alternative (catching the error at every possible
location and then trying to clean up there instead of recursively
deleting the data that cannot be used later) is much worse.
Addresses BIT-1199
- First:
Due to architectural constraints, it is very hard for the
input framework to handle optional records. For an optional record,
either the whole record has to be missing, or all non-optional elements
of the record have to be defined. This information is not available
to input readers after the records have been unrolled into the threading
types.
Behavior so far was to treat optional records like they are non-optional,
without warning. The patch changes this behavior to emit an error on stream-
creation (during type-checking) and refusing to open the file. I think this
is a better idea - the behavior so far was undocumented and unintuitive.
- Second:
For table and event streams, reader backend creation was done very early,
before actually checking if all arguments are valid. Initialization is moved
after the checks now - this makes a number of delete statements unnecessary.
Also - I suspect threads of failed input reader instances were not deleted
until shutdown
- Third:
Add a couple more consistency checks, e.g. checking if the destination value
of a table has the same type as we need. We did not check everything in all
instances, instead we just assigned the things without caring (which works,
but is not really desirable).
This change also exposed a few bugs in other testcases where table definitions
were wrong (did not respect $want_record)
- Fourth:
Improve error messages and write testcases for all error messages (I think).
Sorry for this - I noticed that I named this option quite unfortunately
while writing the documentation.
The patch also removes the dbname configuration option from the sqlite
input reader - it was not used there at all anymore (and I did not notice
that).
- Generally increased the time allowed before they timeout.
- For tests w/ a clear termination condition (most of them), made
timeouts result in a test failure.
- Seemed to be a race in some cases between tests generating output and
the input reader stream getting removed/closed, so moved stream removal
closer to termination time, when all output should be available.
- Primarily working around an issue that occurs when threads
concurrently create pipes and fork a child process. See comment in
code...
- Other minor cleanup of the code: making sure the child process calls
_exit() versus exit(), limits itself to few select system calls before
the exec(), and closes more unused file descriptors.
- Do stream mode for commands done by exec module, it seems important
in some cases (e.g. ensure requested stdin is fully written).
- For cases where the raw input reader knows the child process has been
reaped, set the childpid member to a sentinel value to indicate such
so we don't later think we should kill it or wait on it anymore.
- More error checking on dup2/close calls. Set sentinel values when
closing ends of pipes to prevent double closing a fd.
- Signal flag not set when raw input reader's child exits as a result
of a signal. Left out a test for this -- might be portability issues
(e.g. Ubuntu seems to do things different regarding the exit code and
also is printing "Killed" to stderr where other platforms don't).
logging: table does not contain all required columns (when extending
data structures)
input: table does not contain all required columns (when extending
data structure), wrong sql statement
* send end_of_data event for all kind of streams
* send process_finished event containing exit code of child process for executed programs
* move raw-tests to separate directory
* expose name of input stream to readers
* better handling of some error cases in raw reader
* new force_kill option for raw reader which SIGKILLs progesses on exit
The ordering of events how they arrive in the main loop is a bit peculiar at the moment.
The process_finished event arrives in scriptland before all of the other events, even though
it should be sent last. I have not yet fully figured that out.
Sadly there also seems to be another deadlock issue which I am currently
not really able to figure out - on shutdown sometimes (too often) the main
thread + all sqlite threads wait for semaphores or mutexes.
more cases.
It will now not only fire after table-reads have been completed,
but also after the last event of a whole-file-read (or whole-db-read, etc.).
The interface also has been extended a bit to allow readers to
directly fire the event should they so choose. This allows the
event to be fired in direct table-setting/event-sending modes,
which was previously not possible.
without a final \0 - which means that strings read by the input framework are
unusable by basically all internal functions (like to_count).
the basic test now also checks this.
Thanks at Sheharbano for noticing this.
* remotes/origin/topic/bernhard/input-warn-on-invalid-numbers:
...and another small change to error handling -> now errors in single lines do not kill processing, but simply ignore the line, log it, and continue.
Ok, this one was a little bit sneaky.
ok, this one might really be a bit too big for 2.1
* origin/fastpath:
Ok, this one is not really necessary for 2.1 and more of a nice-to-have
another small bug found while searching for something else...
Fix two little bugs:
sorry. the patch for the set_separator.
make set_separators different from , work for input framework.
Bug found bei Keith & Seth: input framework was not handling counts and ints out of 32-bit-range correctly.
Before this patch, empty values were not hashed at all. Which had the unfortunate side-effect
that e.g. the lines
TEST -
and
- TEST
have the same hash values. On re-reads that means that the change will
be ignored.
This is probably pretty academic, but this patch changes it and adds a testcase.
Output of the reread test changes due to re-ordering of the output (probably
due to the fact that the internal hash values are changed and thus transferred
in a different order)
Escaped ,'s in sets and vectors were unescaped before tokenization
Handling of zero-length-strings as last element in a set was broken (sets ending with a ,).
Hashing of lines just containing zero-length-strings was broken (now a \0 is appended to each
string before it is hashed - giving us a hash of something for a line just consisting of \0s.
This also allows to differentiate between vectors with varying numbers of zero-length-strings).