Details
- Type: Improvement
- Status: Resolved
- Priority: Low
- Resolution: Won't Fix
Description
Currently, most BlockingQueues in Cassandra are created either without any limit (the execution stages) or with limits high enough to consume gigabytes of heap (PeriodicCommitLogExecutorService). I have observed many cases where a single unresponsive node brought down an entire cluster because the other nodes accumulated huge backlogs of operations.
We need to make sure each queue is configurable through a yaml entry or a system property, with defaults chosen so that no single queue consumes more than 100M of heap. I have successfully tested that adding these limits makes the cluster resistant to heavy load or a bad node.
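As a rough illustration of the idea (not the patch that was tested), the sketch below builds an executor whose work queue has a fixed capacity read from a system property. The class name `BoundedStage`, the property name `example.stage.queue_capacity`, and the default of 1024 are all hypothetical and are not actual Cassandra settings. The bound is expressed as a number of queued tasks, so a heap budget such as 100M would still have to be translated into a task count using an estimate of the per-task size.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Cassandra: an execution stage with a bounded queue.
final class BoundedStage {
    // Assumed property name, chosen only for this example.
    static final String CAPACITY_PROPERTY = "example.stage.queue_capacity";
    // Assumed default; a real default would be derived from a heap budget.
    static final int DEFAULT_CAPACITY = 1024;

    // Reads the capacity from the system property, falling back to the default
    // when the property is missing or not a valid integer; never returns less than 1.
    static int configuredCapacity() {
        return Math.max(1, Integer.getInteger(CAPACITY_PROPERTY, DEFAULT_CAPACITY));
    }

    // Fixed-size pool whose backlog cannot grow past `capacity` tasks.
    // Once the queue is full, AbortPolicy makes execute() throw
    // RejectedExecutionException instead of queueing more work.
    static ThreadPoolExecutor newBoundedExecutor(int threads, int capacity) {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(capacity);
        return new ThreadPoolExecutor(
                threads, threads,
                60L, TimeUnit.SECONDS,
                queue,
                new ThreadPoolExecutor.AbortPolicy());
    }
}
```

With this shape, a caller would start the JVM with something like `-Dexample.stage.queue_capacity=512` and create the stage via `BoundedStage.newBoundedExecutor(threads, BoundedStage.configuredCapacity())`. Rejecting on overflow is only one possible policy; blocking the producer or dropping the oldest task are alternatives with different trade-offs.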
Issue Links
- is related to CASSANDRA-9318 Bound the number of in-flight requests at the coordinator (Resolved)