Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
None
-
None
-
Normal
Description
Gossip heartbeat really needs to run every 1s or other nodes may mark us down. But gossip currently shares an executor thread with other tasks.
I see at least two of these could cause blocking: hint cleanup post-delivery and flush-expired-memtables, both of which call forceFlush which will block if the flush queue + threads are full.
We've run into this before (CASSANDRA-2253); we should move Gossip back to its own dedicated executor or it will keep happening whenever someone accidentally puts something on the "shared" executor that can block.