Description
I have noticed two bug reports about "failed assert `nr == sizeof(uint64_t)`" in EventNotify::signal():
[TrafficServer] using root directory '/usr/local/trafficserver-4.1.0'
FATAL: EventNotify.cc:73: failed assert `nr == sizeof(uint64_t)`
/usr/local/trafficserver-4.1.0/bin/traffic_server - STACK TRACE:
/usr/local/trafficserver-4.1.0/lib/libtsutil.so.4(+0x14b57)[0x2b71a2d84b57]
/usr/local/trafficserver-4.1.0/lib/libtsutil.so.4(+0x13d7f)[0x2b71a2d83d7f]
/usr/local/trafficserver-4.1.0/lib/libtsutil.so.4(+0x1c32e)[0x2b71a2d8c32e]
/usr/local/trafficserver-4.1.0/bin/traffic_server(LogObject::_checkout_write(unsigned long*, unsigned long)+0x1f5)[0x5adf25]
/usr/local/trafficserver-4.1.0/bin/traffic_server(LogObjectManager::check_buffer_expiration(long)+0x7b)[0x5afbfb]
/usr/local/trafficserver-4.1.0/bin/traffic_server(Log::periodic_tasks(long)+0xe2)[0x595c92]
/usr/local/trafficserver-4.1.0/bin/traffic_server(Log::flush_thread_main(void*)+0x28e)[0x596cee]
/usr/local/trafficserver-4.1.0/bin/traffic_server[0x59abcd]
/usr/local/trafficserver-4.1.0/bin/traffic_server(EThread::execute()+0x1159)[0x6ae139]
/usr/local/trafficserver-4.1.0/bin/traffic_server[0x6ab93a]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x2b71a479cb50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x2b71a5430a7d]
[E. Mgmt] log ==> [TrafficManager] using root directory '/usr/local/trafficserver-4.1.0'
[TrafficServer] using root directory '/usr/local/trafficserver-4.1.0'
To fix this issue, I have prepared a patch:
1) Use a non-blocking eventfd, so that we can tolerate write() failing with errno EAGAIN. This is acceptable because the signal is already pending in that case, so the receiver will be notified eventually.
2) With a non-blocking eventfd, read() no longer blocks in wait(). So I use epoll_wait() to implement the blocking behavior, just as timedwait() does.
3) A non-blocking eventfd also fixes a potential problem: if the receiver does not read() the data promptly, senders might block in write().
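For reviewers, the three points above can be sketched roughly as follows. This is only a minimal illustration, not the patch itself: the class name NotifySketch, its member names, and the return-value conventions are hypothetical and are not taken from EventNotify.cc. It assumes Linux with eventfd and epoll available.

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#include <cerrno>
#include <cstdint>

// Hypothetical stand-in for EventNotify; names are illustrative only.
class NotifySketch
{
public:
  NotifySketch()
  {
    // Point 1: non-blocking eventfd, so write() never blocks a sender.
    m_event_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    m_epoll_fd = epoll_create1(EPOLL_CLOEXEC);

    struct epoll_event ev = {};
    ev.events             = EPOLLIN;
    ev.data.fd            = m_event_fd;
    epoll_ctl(m_epoll_fd, EPOLL_CTL_ADD, m_event_fd, &ev);
  }

  ~NotifySketch()
  {
    close(m_epoll_fd);
    close(m_event_fd);
  }

  // Returns true if the signal was written or is already pending.
  bool signal()
  {
    uint64_t value = 1;
    ssize_t nr     = write(m_event_fd, &value, sizeof(value));
    if (nr == static_cast<ssize_t>(sizeof(value))) {
      return true;
    }
    // EAGAIN means the counter is saturated, so a wakeup is already queued.
    return nr < 0 && errno == EAGAIN;
  }

  // Point 2: block in epoll_wait() instead of read().
  // timeout_ms == -1 waits indefinitely. Returns 0 on success,
  // ETIMEDOUT on timeout, or another errno value on failure.
  int wait(int timeout_ms = -1)
  {
    struct epoll_event ev;
    int n;
    do {
      // Note: a retry after EINTR restarts the full timeout.
      n = epoll_wait(m_epoll_fd, &ev, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);

    if (n == 0) {
      return ETIMEDOUT;
    }
    if (n < 0) {
      return errno;
    }

    // Drain the counter; with a non-blocking fd this returns immediately.
    uint64_t value = 0;
    ssize_t nr     = read(m_event_fd, &value, sizeof(value));
    if (nr == static_cast<ssize_t>(sizeof(value))) {
      return 0;
    }
    return errno; // e.g. EAGAIN if another waiter consumed the event first
  }

private:
  int m_event_fd;
  int m_epoll_fd;
};
```

In this sketch several signal() calls made before a wait() collapse into a single wakeup, because read() on a non-semaphore eventfd resets the counter to zero.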
Please test this patch; any feedback is welcome.