diff --git a/src/script_opt/CPP/README.md b/src/script_opt/CPP/README.md
index 1a8d1e4d9d..1ba87ec6ea 100644
--- a/src/script_opt/CPP/README.md
+++ b/src/script_opt/CPP/README.md
@@ -54,6 +54,13 @@ at the beginning of `Compile.h`.
 Workflows
 ---------
 
+_Before building Zeek_, see the first of the [_Known Issues_](#known-issues)
+below regarding compilation times. If your aim is to exploration of the
+functionality rather than production use, you might want to build Zeek
+using `./configure --enable-debug`, which can reduce compilation times by
+50x (!). Once you've built it, the following sketches how to create
+and use compiled scripts.
+
 The main code generated by the compiler is taken from
 `build/CPP-gen.cc`. An empty version of this is generated when
 first building Zeek.
@@ -66,21 +73,17 @@ The following workflow assumes you are in the `build/` subdirectory:
 
 1. `./src/zeek -O gen-C++ target.zeek`
 The generated code is written to
-`CPP-gen-addl.h`. (This name is a reflection of some more complicated
-features and probably should be changed.) The compiler will also produce
-a file `CPP-hashes.dat`, for use by an advanced feature.
-2. `mv CPP-gen-addl.h CPP-gen.cc`
-3. `touch CPP-gen-addl.h`
-(Needed because `CPP-gen.cc`
-expects the file to exist, again in support of more complicated features.)
-4. `ninja` or `make` to recompile Zeek
-5. `./src/zeek -O use-C++ target.zeek`
+`CPP-gen.cc`. The compiler will also produce
+a file `CPP-hashes.dat`, for use by an advanced feature, and an
+empty `CPP-gen-addl.h` file (same).
+2. `ninja` or `make` to recompile Zeek
+3. `./src/zeek -O use-C++ target.zeek`
 Executes with each function/hook/
 event handler pulled in by `target.zeek` replaced with its compiled version.
 
 Instead of the last line above, you can use the following variants:
 
-5. `./src/zeek -O report-C++ target.zeek`
+3. `./src/zeek -O report-C++ target.zeek`
 For each function body in
 `target.zeek`, reports which ones have compiled-to-C++ bodies available,
 and also any compiled-to-C++ bodies present in the `zeek` binary that
@@ -91,15 +94,21 @@ the `target.zeek` script. You can avoid this by replacing the first step with:
 
 1. `./src/zeek -O gen-standalone-C++ target.zeek >target-stand-in.zeek`
 
-and then continuing the next three steps. This option prints to _stdout_ a
+(and then building as in the 2nd step above).
+This option prints to _stdout_ a
 (very short) "stand-in" Zeek script that you can load using
-`-O use-C++ target-stand-in.zeek` to activate the compiled `target.zeek`
-without needing to include `target.zeek` in the invocation.
+`target-stand-in.zeek` to activate the compiled `target.zeek`
+without needing to include `target.zeek` in the invocation (nor
+the `-O use-C++` option). After loading the stand-in script,
+you can still access types and functions declared in `target.zeek`.
 
 Note: the implementation differences between `gen-C++` and `gen-standalone-C++`
 wound up being modest enough that it might make sense to just always provide the
 latter functionality, which it turns out does not introduce any additional
 constraints compared to the current `gen-C++` functionality.
+On the other hand, it's possible (not yet established) that code created
+using `gen-C++` can be made to compile significantly faster than
+standalone code.
 
 There are additional workflows relating to running the test suite, which
 we document only briefly here as they're likely going to change or go away
@@ -128,7 +137,7 @@ Both of these _append_ to any existing `CPP-gen-addl.h` file, providing
 a means for building it up to reflect a number of compilations.
 
 The `update-C++` and `add-C++` options help support different
-ways of building the `btest` test suie. They were meant to enable doing so
+ways of building the `btest` test suite. They were meant to enable doing so
 without requiring per-test-suite-element recompilations. However, experiences
 to date have found that trying to avoid pointwise compilations incurs
 additional headaches, so it's better to just bite off the cost of a large
@@ -174,11 +183,6 @@ Known Issues
 
 Here we list various known issues with using the compiler:
 
-* Run-time error messages generally lack location information and information
-about associated expressions/statements, making them hard to puzzle out.
-This could be fixed, but would add execution overhead in passing around
-the necessary strings / `Location` objects.
-
 * Compilation of compiled code can be noticeably slow (if built using
 `./configure --enable-debug`) or hugely slow (if not), with the latter
 taking on the order of an hour on a beefy laptop. This slowness complicates
@@ -186,6 +190,11 @@ CI/CD approaches for always running compiled code against the test suite
 when merging changes. It's not presently clear how feasible it is to
 speed this up.
 
+* Run-time error messages generally lack location information and information
+about associated expressions/statements, making them hard to puzzle out.
+This could be fixed, but would add execution overhead in passing around
+the necessary strings / `Location` objects.
+
 * Subtle bugs can arise when compiling code that uses `@if` conditional
 compilation. The compiled code will not directly use the wrong instance
 of a script body (one that differs due to the `@if` conditional having a