CASSANDRA-5143

Safety valve on number of tombstones skipped on read path to prevent a full heap

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Fix Version/s: None
    • Component/s: Core
    • Labels:
      None
    • Environment:

      Debian Linux, 3 node cluster with RF 3, 8GB heap on 32GB machines

      Description

      When doing a range query on a row with a lot of tombstones, the tombstones can quickly add up and use too much heap. This happens even if we specify a column count of 2, because the tombstones can sit between those two live columns. The client API can do nothing to prevent this, since there is no way to specify a limit on the number of tombstones collected.

      I know that this looks like the "I'm using a row as a queue and building up a ton of tombstones" anti-pattern, but Cassandra should still be able to take better care of itself to prevent a DoS. I can imagine a lot of use cases that let users create and delete columns on a row.

      I propose a simple safety valve that acts like this: "The client has asked me for X columns, I've already collected X^Y columns and still have not found X live columns, so I should just give up and return an exception". Y would be the configurable parameter. Time taken per query or memory used could also be taken into consideration.
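
      As a rough illustration of the proposal above, here is a minimal Python sketch of such a safety valve. It is a toy model, not Cassandra's read path: `Cell`, `TombstoneLimitExceeded`, and `read_slice` are hypothetical names made up for this example, and the budget is simply the literal X^Y from the description.

      ```python
      from dataclasses import dataclass
      from typing import Iterable, List


      @dataclass(frozen=True)
      class Cell:
          """Toy stand-in for a column in a row (hypothetical, not a Cassandra type)."""
          name: str
          is_tombstone: bool = False


      class TombstoneLimitExceeded(Exception):
          """Hypothetical error raised when the scan budget is used up."""


      def read_slice(cells: Iterable[Cell], count: int, exponent: float) -> List[Cell]:
          """Return up to `count` live cells, giving up once count**exponent
          cells have been scanned without finding `count` live ones."""
          budget = count ** exponent  # the "X^Y" from the proposal
          live: List[Cell] = []
          scanned = 0
          for cell in cells:
              if len(live) == count:
                  break  # enough live cells found, stop scanning
              scanned += 1
              if scanned > budget:
                  raise TombstoneLimitExceeded(
                      f"scanned more than {budget} cells, found only {len(live)} of {count} live"
                  )
              if not cell.is_tombstone:
                  live.append(cell)
          return live


      # A row with two live columns separated by 5000 tombstones.
      row = [Cell("a")] + [Cell(f"t{i}", is_tombstone=True) for i in range(5000)] + [Cell("z")]

      # Budget 2**13 = 8192 covers the whole row, so both live columns are returned.
      names = [c.name for c in read_slice(row, count=2, exponent=13)]

      # Budget 2**10 = 1024 is exhausted inside the tombstone run, so the read fails fast.
      try:
          read_slice(row, count=2, exponent=10)
          failed_fast = False
      except TombstoneLimitExceeded:
          failed_fast = True
      ```

      One consequence of taking X^Y literally: with X = 1 the budget is 1 no matter what Y is, so a real implementation would probably want a floor or a different formula. The description does not specify this.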

          Activity

          André Cruz created issue -
          André Cruz made changes -
          Field: Description
          Change: appended "and return an exception" after "I should just give up" in the quoted safety-valve behavior; the rest of the description is unchanged.
          André Cruz made changes -
          Field: Summary
          Original Value: Safety valve on number of tombstones skipped on read path too prevent a full heap
          New Value: Safety valve on number of tombstones skipped on read path to prevent a full heap
          Gavin made changes -
          Field: Workflow
          Original Value: no-reopen-closed, patch-avail [ 12745419 ]
          New Value: patch-available, re-open possible [ 12753759 ]
          Gavin made changes -
          Field: Workflow
          Original Value: patch-available, re-open possible [ 12753759 ]
          New Value: reopen-resolved, no closed status, patch-avail, testing [ 12758945 ]
          Jonathan Ellis made changes -
          Field: Link
          New Value: This issue duplicates CASSANDRA-6117
          Jonathan Ellis made changes -
          Field: Status
          Original Value: Open [ 1 ]
          New Value: Resolved [ 5 ]
          Field: Resolution
          New Value: Duplicate [ 3 ]
          Transition: Open to Resolved
          Time In Source Status: 263d 19h 47m
          Execution Times: 1
          Last Executer: Jonathan Ellis
          Last Execution Date: 01/Oct/13 15:35

            People

            • Assignee:
              Unassigned
              Reporter:
              André Cruz
            • Votes:
              2
              Watchers:
              4
