Description
In libzookeeper_mt, if your process is running slowly (for example under Valgrind's Memcheck) or is stopped at breakpoints in gdb, it can occasionally receive SIGPIPE when trying to send a message to the cluster. For example:
==12788==
==12788== Process terminating with default action of signal 13 (SIGPIPE)
==12788== at 0x3F5180DE91: send (in /lib64/libpthread-2.5.so)
==12788== by 0x7F060AA: ??? (in /usr/lib64/libzookeeper_mt.so.2.0.0)
==12788== by 0x7F06E5B: zookeeper_process (in /usr/lib64/libzookeeper_mt.so.2.0.0)
==12788== by 0x7F0D38E: ??? (in /usr/lib64/libzookeeper_mt.so.2.0.0)
==12788== by 0x3F5180673C: start_thread (in /lib64/libpthread-2.5.so)
==12788== by 0x3F50CD3F6C: clone (in /lib64/libc-2.5.so)
==12788==
This is probably not the behavior we want, since the client already handles server disconnections when a call to send fails. There are a few ways to fix this. In BSD environments, we can tell a socket never to raise SIGPIPE on send by using setsockopt:
int set = 1;
setsockopt(sd, SOL_SOCKET, SO_NOSIGPIPE, (void *)&set, sizeof(int));
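In Linux environments, we can add the MSG_NOSIGNAL flag to every send call, which tells the kernel not to raise SIGPIPE when the peer has closed the connection. The call then fails with EPIPE instead, which the existing error handling can deal with.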
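A minimal sketch of how the two options might be combined behind compile-time checks. The helper names zk_disable_sigpipe and zk_send_nosignal are hypothetical and are not part of the ZooKeeper C client; the sketch assumes SO_NOSIGPIPE and MSG_NOSIGNAL are defined as macros on the platforms that support them.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: call once after the socket is created.
 * Where SO_NOSIGPIPE exists (BSD-style systems), it suppresses SIGPIPE
 * for all later writes on this socket. Elsewhere it is a no-op. */
static int zk_disable_sigpipe(int sd)
{
#ifdef SO_NOSIGPIPE
    int set = 1;
    return setsockopt(sd, SOL_SOCKET, SO_NOSIGPIPE, &set, sizeof(set));
#else
    (void)sd;
    return 0;
#endif
}

/* Hypothetical helper: a drop-in replacement for send().
 * Where MSG_NOSIGNAL exists (Linux), it is passed on every call so a
 * closed peer yields -1 with errno == EPIPE instead of a signal. */
static ssize_t zk_send_nosignal(int sd, const void *buf, size_t len)
{
#ifdef MSG_NOSIGNAL
    return send(sd, buf, len, MSG_NOSIGNAL);
#else
    return send(sd, buf, len, 0);
#endif
}
```

With something like this in place, a write to a dropped connection would surface as an ordinary send failure, so the existing disconnect handling would run instead of the process being terminated.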
For more information, see: http://stackoverflow.com/questions/108183/how-to-prevent-sigpipes-or-handle-them-properly