* origin/topic/dev/non-ascii-logging:
Removed Policy Script for UTF-8 Logs
Commented out UTF-8 Script in Test All Policy
Minor Style Tweak
Use getNumBytesForUTF8 method to determine number of bytes
Added Jon's test cases as unit tests
Prioritizes escaping predefined Escape Sequences over Unescaping UTF-8 Sequences
Added additional check to confirm anything unescaping is a multibyte UTF-8 sequence, addressing the test case Jon brought up
Added optional script and redef bool to enable utf-8 in ASCII logs
Initial Commit, removed std::isprint check to escape
Made minor code format and logic adjustments during merge.
This commit marks (hopefully) ever one-parameter constructor as explicit.
It also uses override in (hopefully) all circumstances where a virtual
method is overridden.
There are a very few other minor changes - most of them were necessary
to get everything to compile (like one additional constructor). In one
case I changed an implicit operation to an explicit string conversion -
I think the automatically chosen conversion was much more convoluted.
This took longer than I want to admit but not as long as I feared :)
With this patch the model is:
- "print" cleans the data so that non-printable characters get
escaped. This is not necessarily reversible.
- to print in a reversible way, one can go through
escape_string(); this escapes backslashes as well to make the
decoding non-ambigious.
- Logging always escapes similar to escape_string(), making it
reversible.
Compared to master, we also change the escaping as follows:
- We now only escape with "\xXX", no more "^X" or "\0". Exception:
backslashes.
- We escape backlashes as "\\".
- There's no "alternative" output style anymore, i.e., fmt() '%A'
qualifier is gone.
Baselines in testing/btest are updated, external tests not yet.
Addresses BIT-1333.
* origin/topic/bernhard/input-logging-commmon-functions:
add the last of Robins suggestions (separate info-struct for constructors).
port memory leak fix from master
harmonize function naming
move AsciiInputOutput over to threading
and thinking about it, ascii-io doesn't need the separator
change constructors
and factor stuff out the input framework too.
factor out ascii input/output.
std::string accessors to escape_sequence functionality
intermediate commit - it has been over a month since I touched this...
I cleaned up the AsciiInputOutput class somewhat, including renaming
it to AsciiFormatter, renaming some of its methods, and turning the
static methods into members for consistency.
Closes#929.
First step - factored out everything the logging classes
use ( so only output ).
Moved the script-level configuration to logging/main,
and made the individual writers just refer to it -
no idea if this is good design. It works. But I am happy
about opinions :)
Next step - add support for input...
As we can't use the IPAddr class (because it's not thread-safe), this
involved a bit manual address manipulation and also shuffling some
things around a bit.
Not fully working yet, the tests for remote logging still fail.
pass yet.
Changes:
- Gave IPAddress/IPPrefix methods AsString() so that one doesn't need
to cast to get a string represenation.
- Val::AsAddr()/AsSubnet() return references rather than pointers. I
find that more intuitive.
- ODesc/Serializer/SerializationFormat get methods to support
IPAddress/IPPrefix directly.
- Reformatted the comments in IPAddr.h from /// to /** style.
- Given IPPrefix a Contains() method.
- A bit of cleanup.
When using a `print` statement to write to a file that has raw output
enabled, NUL characters in string are no longer interpreted into "\0",
no newline is appended afterwards, and each argument to `print` is
written to the file without any additional separation.
(Re)Assigning to identifiers with the &raw_output attribute should also
now correctly apply the attribute to the file value being assigned.
Note that the write_file BiF should already be capable of raw string
data to a file, expect it bypasses the print_hook event.
Addresses #474
And (to be consistent with current conventions for reST documentation)
update places in the auto-documentation-generation framework
where tabs were used in the generated reST.