This change tracks the current offset (number of bytes fed into matchers)
on the top-level RuleEndpointState such that we can compute the relative ending
for matched texts individually.
Additionally, it adds the data_end_offset as a new optional parameter to
signature_match().
Add pattern_end_offset to signature_state
Update init-bare.zeek
Update RuleMatcher.cc
Update RuleMatcher.h
Update init-bare.zeek
clang format
clang format
clang format
Using Match Offsets List
Temp commit
This largely copies over Spicy's `.clang-format` configuration file. The
one place where we deviate is header include order since Zeek depends on
headers being included in a certain order.
The new hooks works similar to the existing `HookLoadFile` but,
additionally, allows the plugin to return a string that contains the
code to be used for the file being loaded. If the plugin does so, the
content of any actual file on disk will be ignored (in fact, there
doesn't even need to be a file on disk in that case). This works for
both Zeek scripts and signatures.
There's a new test that covers the new functionality, testing loading
both scripts and signatures from memory. I also manually tested that the
debugger integration works, but I don't see much of a way to add a
regression test for that part.
We keep the existing hook as well for backwards compatibility. We could
decide to deprecate it, but not sure that buys us much, so left that
out.
Closes#1757.
This (1) fixes an issue where signature files supplied on the command
line wouldn't pass through the hooks, and (2) prepares for allowing
hooks to supply the content of a signature file directly.
* origin/topic/timw/776-using-statements:
Remove 'using namespace std' from SerialTypes.h
Remove other using statements from headers
GH-776: Remove using statements added by PR 770
Includes small fixes in files that changed since the merge request was
made.
Also includes a few small indentation fixes.
* origin/topic/timw/nullptr:
The remaining nulls
plugin/probabilistic/zeekygen: Replace nulls with nullptr
file_analysis: Replace nulls with nullptr
analyzer: Replace nulls with nullptr
iosource/threading/input/logging: Replace nulls with nullptr
The big hitters:
Dict: Fills in four 4-byte holes in the structure. This shrinks Dictionary from 136 bytes to 114 bytes.
Desc: Fills in a 6-byte hole in the structure. This shrinks ODesc from 152 bytes to 144 bytes.
Frame: Moves and combines 4 bool variables from a few places into one single 4-byte block. This resolves all of the holes at once. This shrinks Frame from 216 bytes to 192 bytes and removes one cache line.
Func: Moves one int32_t variable to fill in a 4-byte hole. This shrinks Func from 112 bytes to 104 bytes.
ID: Moves two bool variables to fill in a 3-byte hole. This leaves behind a 1-byte hole, but removes a 6-byte pad from the end of the structure. This shrinks ID from 144 bytes to 136 bytes.
Other changes:
RuleHdrTest: Fills in one 4-byte hole in the structure. This shrinks RuleHdrTest from 248 bytes to 240 bytes.
RuleEndpointState: Moves one bool variable down in the structure to reduce a 7-byte hole. This unfortunately causes a 3-byte hole later in the structure but there’s no easy way to filll it in. This does shrink RuleEndpointState from 128 bytes to 120 bytes though.
ScannedFile: Moves two bool values to reduce a 4-byte hole by 2 bytes. This shrinks ScannedFile from 64 bytes to 56 bytes.
Brofiler: Moves one char value to reduce a 4-byte hole by 1 byte. This shrinks Brofiler from 96 bytes to 88 bytes and removes one cache line.
DbgBreakpoint: Moves some values around to fill in a 4-byte hole and reduce a second. A 2-byte hole still exists, but the structure shrinks from 632 bytes to 624 bytes. It’s possible on this one that one of the int32_t values could be an int16_t and remove the last 2-byte gap.
ParseLocationRec: Moves one int to fill in a 4-byte hole. This shrinks ParseLocationRec from 32 bytes to 24 bytes.
DebugCmdInfo: Moves one bool variable to shift a few others up. This results in a 6-byte pad at the end of the structure but removes a 7-byte hole in the middle. This shrinks DebugCmdInfo from 56 bytes to 48 bytes.
FragReassembler: Moves one variable down to fill in a 4-byte hole. This shrinks FragReassembler from 272 bytes to 264 bytes.
nb_dns_result: Moves ones uint32_t variable to fill in a 4-byte hole, also removing a 4-byte pad from the end of the structure. This shrinks nb_dns_result from 32 bytes to 24 bytes.
nb_dns_entry: Moves one short value to fill in a 2-byte hole, also removing a 6-byte hole. This shrinks nb_dns_entry from 1064 bytes to 1056 bytes.
The Zeek code base has very inconsistent #includes. Many sources
included a few headers, and those headers included other headers, and
in the end, nearly everything is included everywhere, so missing
#includes were never noticed. Another side effect was a lot of header
bloat which slows down the build.
First step to fix it: in each source file, its own header should be
included first to verify that each header's includes are correct, and
none is missing.
After adding the missing #includes, I replaced lots of #includes
inside headers with class forward declarations. In most headers,
object pointers are never referenced, so declaring the function
prototypes with forward-declared classes is just fine.
This patch speeds up the build by 19%, because each compilation unit
gets smaller. Here are the "time" numbers for a fresh build (with a
warm page cache but without ccache):
Before this patch:
3144.94user 161.63system 3:02.87elapsed 1808%CPU (0avgtext+0avgdata 2168608maxresident)k
760inputs+12008400outputs (1511major+57747204minor)pagefaults 0swaps
After this patch:
2565.17user 141.83system 2:25.46elapsed 1860%CPU (0avgtext+0avgdata 1489076maxresident)k
72576inputs+9130920outputs (1667major+49400430minor)pagefaults 0swaps
The full process hierarchy isn't set up yet, but these changes
help prepare by doing two things:
- Add a -j option to enable supervisor-mode. Currently, just a single
"stem" process gets forked early on to be used as the basis for
further forking into real cluster nodes.
- Separates the parsing of command-line options from their consumption.
i.e. need to parse whether we're in -j supervisor-mode before
modifying any global state since that would taint the "stem" process.
The new intermediate structure containing the parsed options may
also serve as a way to pass configuration info from "stem" to its
descendent cluster node processes.
The logic for initializing PIA endpoint matchers was previously
skipped if "there's no global rule matcher", and that's only true
when no signature files get loaded.
But when using `zeek -b`, some file-magic signatures still get loaded
by default, so the PIA endpoint matchers still get initialized even
though they don't need to be -- file-magic patterns play no part
in PIA.
For typical use-cases (not using the `-b` flag), this change won't
help any, but we do at least use `-b` often within the test suite.