Details
- Type: Improvement
- Status: Resolved
- Priority: Normal
- Resolution: Duplicate
- Environment: Debian Linux, 3 node cluster with RF 3, 8GB heap on 32GB machines
Description
A range query on a row with a lot of tombstones can quickly use too much heap, even if we specify a column count of 2, because the tombstones can sit between those two live columns. The client can do nothing to prevent this, since the API offers no way to limit the number of tombstones collected.
I know that this looks like the "I'm using a row as a queue and building up a ton of tombstones" anti-pattern, but Cassandra should still be able to take better care of itself and prevent a DoS. I can imagine a lot of use cases that let users create and delete columns on a row.
I propose a simple safety valve that works like this: "The client has asked me for X columns, I've already collected X^Y columns and still have not found X live ones, so I should just give up and throw an exception". Y would be the configurable parameter. Time taken per query or memory used could also be taken into consideration.
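To make the proposal concrete, here is a minimal sketch of the safety valve in Python. It is only an illustration, not Cassandra code: the names `collect_live_columns` and `TooManyTombstonesError` are hypothetical, a row is modelled as a plain list of `(name, is_tombstone)` pairs, and the budget is taken literally as X^Y from the text above.

```python
class TooManyTombstonesError(Exception):
    """Raised when a read scans too many cells without finding enough live ones."""


def collect_live_columns(cells, requested, y):
    """Return up to `requested` live column names from `cells`.

    `cells` is an iterable of (name, is_tombstone) pairs in storage order
    (a hypothetical stand-in for a real row). The scan gives up once more
    than requested ** y cells have been read without finding enough live ones.
    """
    limit = requested ** y  # the X^Y budget from the proposal
    live = []
    scanned = 0
    for name, is_tombstone in cells:
        if len(live) >= requested:
            break  # the client has what it asked for
        scanned += 1
        if scanned > limit:
            raise TooManyTombstonesError(
                f"scanned {scanned} cells, found {len(live)} of {requested} live columns"
            )
        if not is_tombstone:
            live.append(name)
    return live


# Two live columns separated by ten tombstones.
row = [("a", False)] + [(f"t{i}", True) for i in range(10)] + [("b", False)]
```

With this row, `collect_live_columns(row, 2, 4)` has a budget of 16 cells and returns both live columns, while `collect_live_columns(row, 2, 2)` has a budget of 4 and raises `TooManyTombstonesError` partway through the tombstones. Note that with X = 1 the budget X^Y is always 1, so a real implementation would probably want a different formula or a minimum floor.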
Issue Links
- duplicates CASSANDRA-6117 Avoid death-by-tombstone by default (Resolved)