Description
We're encountering the segfault shown below (stack trace abridged).
This happens because the watcher hashtable has no locking and is accessed concurrently from multiple threads:
- the thread doing zoo_aremove_watches, and
- the IO thread adding / firing watchers
We encountered this with ZooKeeper 3.5.8; from code inspection, the relevant code appears to be unchanged in 3.6.
*** Aborted at 1594473472 (Unix time, try 'date -d @1594473472') ***
*** Signal 11 (SIGSEGV) (0xae000000aa) received by PID 199 (pthread TID 0x7f1d64667700) (linux TID 1273) (code: address not mapped to object), stack trace: ***
@ 00007f1d98dfc8b3 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
/src/folly/folly/experimental/symbolizer/SignalHandler.cpp:431
@ 00007f1d95e6c89f (unknown)
@ 00007f1d8f73de1e containsWatcher.part.3
/src/zookeeper/zookeeper-client/zookeeper-client-c/src/zk_hashtable.c:152
@ 00007f1d8f73e806 pathHasWatcher
/src/zookeeper/zookeeper-client/zookeeper-client-c/src/zk_hashtable.c:456
@ 00007f1d8f7382dd aremove_watches
/src/zookeeper/zookeeper-client/zookeeper-client-c/src/zookeeper.c:4260
@ 00007f1d8f738f82 zoo_aremove_watches
/src/zookeeper/zookeeper-client/zookeeper-client-c/src/zookeeper.c:5131
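One possible way to close the race described above is to serialize every access to the watcher table behind a mutex, so that the thread calling zoo_aremove_watches and the IO thread never touch the table at the same time. The following is only a minimal sketch under that assumption, not the actual ZooKeeper code or a proposed patch: `locked_watcher_table`, `watcher_entry`, and the `lwt_*` functions are hypothetical names, and a plain linked list stands in for the real hashtable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one watcher registration (not the ZooKeeper struct). */
typedef struct watcher_entry {
    char *path;                  /* owned copy of the znode path */
    void *watcher;               /* opaque watcher callback pointer */
    struct watcher_entry *next;
} watcher_entry;

/* Hypothetical table: a linked list guarded by a single mutex. */
typedef struct {
    pthread_mutex_t lock;
    watcher_entry *head;
} locked_watcher_table;

void lwt_init(locked_watcher_table *t) {
    assert(t != NULL);
    pthread_mutex_init(&t->lock, NULL);
    t->head = NULL;
}

/* Lookup helper; the caller must already hold t->lock. */
static watcher_entry *lwt_find_locked(locked_watcher_table *t,
                                      const char *path, void *watcher) {
    for (watcher_entry *e = t->head; e != NULL; e = e->next) {
        if (e->watcher == watcher && strcmp(e->path, path) == 0) {
            return e;
        }
    }
    return NULL;
}

/* Returns 1 if added, 0 if already present, -1 on allocation failure. */
int lwt_add(locked_watcher_table *t, const char *path, void *watcher) {
    int rc = 0;
    pthread_mutex_lock(&t->lock);
    if (lwt_find_locked(t, path, watcher) == NULL) {
        size_t len = strlen(path) + 1;
        watcher_entry *e = malloc(sizeof *e);
        char *copy = (e != NULL) ? malloc(len) : NULL;
        if (copy == NULL) {
            free(e);
            rc = -1;
        } else {
            memcpy(copy, path, len);
            e->path = copy;
            e->watcher = watcher;
            e->next = t->head;
            t->head = e;
            rc = 1;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return rc;
}

/* Returns 1 if (path, watcher) is registered, 0 otherwise. */
int lwt_contains(locked_watcher_table *t, const char *path, void *watcher) {
    pthread_mutex_lock(&t->lock);
    int found = lwt_find_locked(t, path, watcher) != NULL;
    pthread_mutex_unlock(&t->lock);
    return found;
}

/* Returns 1 if an entry was removed, 0 if none matched. */
int lwt_remove(locked_watcher_table *t, const char *path, void *watcher) {
    int removed = 0;
    pthread_mutex_lock(&t->lock);
    for (watcher_entry **pp = &t->head; *pp != NULL; pp = &(*pp)->next) {
        watcher_entry *e = *pp;
        if (e->watcher == watcher && strcmp(e->path, path) == 0) {
            *pp = e->next;       /* unlink before freeing */
            free(e->path);
            free(e);
            removed = 1;
            break;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return removed;
}
```

In this sketch the check in `lwt_contains` and the unlink in `lwt_remove` each run entirely under the lock, so a concurrent add or remove cannot free an entry while another thread is still walking the list. A real fix would also need to cover any check-then-act sequence that spans several calls, which this sketch does not attempt.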