Changing what's escaped when printing.

With this patch the model is:

    - "print" cleans the data so that non-printable characters get
      escaped. This is not necessarily reversible.

    - To print in a reversible way, one can go through
      escape_string(); this escapes backslashes as well to make the
      decoding unambiguous.

    - Logging always escapes similarly to escape_string(), making it
      reversible.
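
A minimal sketch of why the backslash has to be escaped for the output to be reversible. The function names (clean_for_print, escape_reversible, unescape_sketch) are hypothetical and not the functions touched by this patch; lowercase hex digits and well-formed input to the decoder are assumptions as well.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format one byte as "\xXX" (lowercase hex assumed).
static std::string hex_escape(unsigned char c)
	{
	char buf[5];	// '\\', 'x', two hex digits, NUL
	std::snprintf(buf, sizeof(buf), "\\x%02x", c);
	return buf;
	}

// Hypothetical "print"-style cleaning: only bytes outside [32, 126] are
// escaped. Backslashes pass through, so the result may be ambiguous.
std::string clean_for_print(const std::string& in)
	{
	std::string out;
	for ( unsigned char c : in )
		{
		if ( c >= 32 && c <= 126 )
			out += static_cast<char>(c);
		else
			out += hex_escape(c);
		}
	return out;
	}

// Hypothetical reversible escaping: same as above, plus '\' -> "\\".
std::string escape_reversible(const std::string& in)
	{
	std::string out;
	for ( unsigned char c : in )
		{
		if ( c == '\\' )
			out += "\\\\";
		else if ( c >= 32 && c <= 126 )
			out += static_cast<char>(c);
		else
			out += hex_escape(c);
		}
	return out;
	}

// Hypothetical decoder for escape_reversible() output. Assumes the input
// is well-formed, i.e. every "\x" is followed by two hex digits.
std::string unescape_sketch(const std::string& in)
	{
	std::string out;
	for ( std::size_t i = 0; i < in.size(); ++i )
		{
		if ( in[i] == '\\' && i + 1 < in.size() && in[i + 1] == '\\' )
			{
			out += '\\';
			i += 1;	// skip the second backslash
			}
		else if ( in[i] == '\\' && i + 3 < in.size() && in[i + 1] == 'x' )
			{
			out += static_cast<char>(std::stoi(in.substr(i + 2, 2), nullptr, 16));
			i += 3;	// skip 'x' and both hex digits
			}
		else
			out += in[i];
		}
	return out;
	}
```

With these helpers, a three-byte string containing a NUL and the six-character literal text a\x00b are cleaned to the same output by clean_for_print(), while escape_reversible() keeps them distinct and unescape_sketch() recovers each original.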

Compared to master, we also change the escaping as follows:

    - We now only escape with "\xXX", no more "^X" or "\0". Exception:
      backslashes.

    - We escape backslashes as "\\".

    - There's no "alternative" output style anymore, i.e., fmt()'s '%A'
      qualifier is gone.

Baselines in testing/btest are updated; external tests are not yet.

Addresses BIT-1333.
Robin Sommer 2015-04-15 09:59:09 -07:00
parent e41c623ad0
commit 7344052b50
66 changed files with 397 additions and 349 deletions


@@ -75,21 +75,17 @@ public:
     enum render_style {
         ESC_NONE = 0,
-        //ESC_NULL = (1 << 0), // 0 -> "\0"
-        //ESC_DEL = (1 << 1), // DEL -> "^?"
-        //ESC_LOW = (1 << 2), // values <= 26 mapped into "^[A-Z]"
-        ESC_ESC = (1 << 3), // '\' -> "\\"
-        ESC_QUOT = (1 << 4), // '"' -> "\"", ''' -> "\'"
-        ESC_HEX = (1 << 5), // Not in [32, 126]? -> "%XX"
-        ESC_DOT = (1 << 6), // Not in [32, 126]? -> "."
+        ESC_ESC = (1 << 1), // '\' -> "\\"
+        ESC_QUOT = (1 << 2), // '"' -> "\"", ''' -> "\'"
+        ESC_HEX = (1 << 3), // Not in [32, 126]? -> "\xXX"
+        ESC_DOT = (1 << 4), // Not in [32, 126]? -> "."
         // For serialization: '<string len> <string>'
         ESC_SER = (1 << 7),
     };
     static const int EXPANDED_STRING = // the original style
-        ESC_ESC | ESC_HEX;
+        ESC_HEX;
     static const int BRO_STRING_LITERAL = // as in a Bro string literal
         ESC_ESC | ESC_QUOT | ESC_HEX;
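
To illustrate how the renumbered flags combine, here is a rough standalone sketch. The flag values and the two style constants are copied from the new side of the hunk; render_sketch() is a hypothetical stand-in, not the renderer in the patched file. The check order (ESC_HEX before ESC_DOT) and lowercase hex digits are assumptions.

```cpp
#include <cstdio>
#include <string>

// Flag values as they appear on the new side of the hunk (ESC_SER omitted).
enum render_style {
	ESC_NONE = 0,
	ESC_ESC  = (1 << 1),	// '\' -> "\\"
	ESC_QUOT = (1 << 2),	// '"' -> "\"", ''' -> "\'"
	ESC_HEX  = (1 << 3),	// Not in [32, 126]? -> "\xXX"
	ESC_DOT  = (1 << 4),	// Not in [32, 126]? -> "."
};

static const int EXPANDED_STRING = ESC_HEX;
static const int BRO_STRING_LITERAL = ESC_ESC | ESC_QUOT | ESC_HEX;

// Hypothetical renderer: applies whichever escapes are enabled in 'style'.
std::string render_sketch(const std::string& in, int style)
	{
	std::string out;
	for ( unsigned char c : in )
		{
		bool printable = (c >= 32 && c <= 126);

		if ( c == '\\' && (style & ESC_ESC) )
			out += "\\\\";
		else if ( (c == '"' || c == '\'') && (style & ESC_QUOT) )
			{
			out += '\\';
			out += static_cast<char>(c);
			}
		else if ( ! printable && (style & ESC_HEX) )
			{
			char buf[5];	// '\\', 'x', two hex digits, NUL
			std::snprintf(buf, sizeof(buf), "\\x%02x", c);
			out += buf;
			}
		else if ( ! printable && (style & ESC_DOT) )
			out += '.';
		else
			out += static_cast<char>(c);
		}
	return out;
	}
```

Under these assumptions, EXPANDED_STRING leaves backslashes and quotes untouched and only hex-escapes non-printable bytes, whereas BRO_STRING_LITERAL also escapes backslashes and quotes, which is what makes its output decodable.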