util::safe_write() calls abort() in case of EAGAIN errors. This is
easily observed when starting clusters with 32 workers or more.
Add a custom write_message() function handling EAGAIN by retrying
after a small sleep. It's not clear a more complicated poll() would be
much better: The pipe might be ready for writing, but then our message
might not actually fit in, resulting in another EAGAIN error. And even
poll() would introduce blocking/sleeping code.
Take some precautions against the stem and the supervisor dead-locking
when both pipes are full by draining the other end on EAGAIN errors.
Closes#3043