Fix attempt for "internal error: unknown msg type 115 in Poll()"

Under remote communication overload conditions, the child->parent
chunked IO may start rejecting chunks once the queue is over the hard
cap.  Some messages are made of two chunks; accepting the first chunk
but rejecting the second can put the parent in a bad state, and the
next two chunks it reads are then likely to cause the error.
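
The failure mode can be illustrated with a small model.  This is only a
sketch under assumptions: CappedQueue, ReadPairs and SimulateOverload
are hypothetical names, the cap of 3 is arbitrary, and none of this is
the actual ChunkedIO code.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the child->parent queue; not the real ChunkedIO.
struct CappedQueue {
    std::size_t hard_cap;
    std::deque<std::string> chunks;

    // Rejects the write once hard_cap chunks are already pending.
    bool Write(const std::string& chunk) {
        if (chunks.size() >= hard_cap)
            return false;
        chunks.push_back(chunk);
        return true;
    }
};

typedef std::pair<std::string, std::string> Message;

// Reader that assumes chunks always arrive as (header, payload) pairs.
std::vector<Message> ReadPairs(CappedQueue& q) {
    std::vector<Message> out;
    while (q.chunks.size() >= 2) {
        std::string header = q.chunks.front();
        q.chunks.pop_front();
        std::string payload = q.chunks.front();
        q.chunks.pop_front();
        out.push_back(Message(header, payload));
    }
    return out;
}

// Returns what the reader sees after one payload chunk was rejected.
std::vector<Message> SimulateOverload() {
    CappedQueue q = {3, std::deque<std::string>()};
    q.Write("hdr1");
    q.Write("body1");
    q.Write("hdr2");                   // third chunk still fits
    bool accepted = q.Write("body2");  // fourth chunk hits the cap
    assert(!accepted);
    (void) accepted;
    ReadPairs(q);                      // consumes (hdr1, body1); hdr2 stays queued
    q.Write("hdr3");
    q.Write("body3");
    return ReadPairs(q);               // pairs hdr2 with hdr3
}
```

In this model the second read pairs "hdr2" with "hdr3": a header sits
where a payload was expected, which is the kind of misalignment that
would surface as an unknown message type.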

This patch removes the rejecting functionality completely, so overload
handling now relies solely on shutting down remote peer connections to
help alleviate temporary overload conditions.  The
"chunked_io_buffer_soft_cap" script variable now tunes when this
shutting down starts, and its default setting is double what it used
to be.  Under constant overload conditions, communication.log should
keep stating "queue to parent filling up; shutting down heaviest
connection".
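
A rough sketch of a soft-cap policy of this kind follows.  The names
PendingByPeer and PickPeerToShutDown are hypothetical, and measuring
"heaviest" by pending chunk count is an assumption; the patch itself
may weigh connections differently.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model: peer name -> chunks currently queued for the parent.
typedef std::map<std::string, std::uint64_t> PendingByPeer;

// Returns the peer to shut down, or "" while the total is within soft_cap.
std::string PickPeerToShutDown(const PendingByPeer& pending,
                               std::uint64_t soft_cap) {
    assert(soft_cap > 0);
    std::uint64_t total = 0;
    std::uint64_t heaviest_count = 0;
    std::string heaviest;
    for (PendingByPeer::const_iterator it = pending.begin();
         it != pending.end(); ++it) {
        total += it->second;
        if (it->second > heaviest_count) {
            heaviest_count = it->second;
            heaviest = it->first;
        }
    }
    return total > soft_cap ? heaviest : std::string();
}
```

Nothing is ever rejected in this scheme; pressure is relieved only by
dropping the connection that contributes the most pending chunks.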

An alternative to completely removing the hard cap rejection code could
be ensuring that messages that involve a pair of chunks can never have
the second chunk be rejected when attempting to write it.
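
That alternative might look roughly like the following, where the cap
is checked once per message instead of once per chunk.  PairSafeQueue
and WritePair are hypothetical names for illustration only and are not
part of this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical queue that admits or rejects a two-chunk message as a unit.
struct PairSafeQueue {
    std::size_t hard_cap;
    std::deque<std::string> chunks;

    // Either both chunks are queued or neither is.  The queue may exceed
    // hard_cap by one chunk so that the payload is never rejected on its own.
    bool WritePair(const std::string& header, const std::string& payload) {
        assert(chunks.size() % 2 == 0);  // only whole pairs are ever queued
        if (chunks.size() >= hard_cap)
            return false;
        chunks.push_back(header);
        chunks.push_back(payload);
        return true;
    }
};
```

The trade-off is that the cap becomes slightly soft (it can be
overshot by one chunk), in exchange for the reader never seeing half
of a message.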

Addresses BIT-1376
Jon Siwek 2015-04-16 17:02:44 -05:00
parent 8789d7f527
commit effeaa5b13
6 changed files with 11 additions and 34 deletions


@@ -221,13 +221,6 @@ private:
 // than BUFFER_SIZE.
 static const uint32 FLAG_PARTIAL = 0x80000000;
-// We report that we're filling up when there are more than this number
-// of pending chunks.
-static const uint32 MAX_BUFFERED_CHUNKS_SOFT = 400000;
-// Maximum number of chunks we store in memory before rejecting writes.
-static const uint32 MAX_BUFFERED_CHUNKS = 500000;
 char* read_buffer;
 uint32 read_len;
 uint32 read_pos;
@@ -275,8 +268,6 @@ public:
 virtual void Stats(char* buffer, int length);
 private:
-// Maximum number of chunks we store in memory before rejecting writes.
-static const uint32 MAX_BUFFERED_CHUNKS = 500000;
 // Only returns true if all data has been read. If not, call
 // it again with the same parameters as long as error is not