  1. Cassandra
  2. CASSANDRA-6446

Faster range tombstones on wide partitions


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.1 beta2
    • Component/s: None
    • Labels:
      None

      Description

      With wide partitions (~1M CQL rows in a single partition), after deleting some of the rows we found inefficiencies in the handling of range tombstones on both the write and read paths.

      I have attached 2 patches here, one for the write path (RangeTombstonesWriteOptimization.diff) and another for the read path (RangeTombstonesReadOptimization.diff).

      On the write path, each deletion of a CQL row by primary key is represented by a range tombstone. When such a tombstone is put into the memtable, the original code takes all of the partition's columns from the memtable and checks DeletionInfo.isDeleted for each one in a brute-force loop to decide whether the column should stay in the memtable or has been deleted by the new tombstone. The more columns a partition has, the slower deletions become, with CPU time spent on this brute-force range tombstone check.
      For partitions with more than 10000 columns, the RangeTombstonesWriteOptimization.diff patch loops over the tombstones instead and checks for the existence of columns for each of them. It also copies the memtable's whole range tombstone list only if there are changes to be made to it (the original code copies the range tombstone list on every write).
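
      To illustrate the difference between the two loops, here is a minimal sketch. It is not the code from the patch: the class, the Tombstone type, the integer column names and the TreeMap layout are all hypothetical stand-ins for the memtable structures. Only the 10000-column threshold comes from the description above.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: columns are a sorted map of column name -> write timestamp.
class RangeTombstoneWriteSketch {
    /** Hypothetical tombstone: deletes names in [start, end] written at or before deletedAt. */
    static final class Tombstone {
        final int start, end;
        final long deletedAt;
        Tombstone(int start, int end, long deletedAt) {
            this.start = start; this.end = end; this.deletedAt = deletedAt;
        }
    }

    // Threshold mentioned in the issue description.
    static final int WIDE_PARTITION_THRESHOLD = 10_000;

    // Brute-force approach: every column is checked against every new tombstone.
    static int purgeByColumns(TreeMap<Integer, Long> columns, List<Tombstone> tombstones) {
        int before = columns.size();
        columns.entrySet().removeIf(e -> isDeleted(e.getKey(), e.getValue(), tombstones));
        return before - columns.size();
    }

    static boolean isDeleted(int name, long timestamp, List<Tombstone> tombstones) {
        for (Tombstone t : tombstones)
            if (name >= t.start && name <= t.end && timestamp <= t.deletedAt) return true;
        return false;
    }

    // Tombstone-driven approach: for each tombstone, visit only the columns it covers.
    // Assumes start <= end for every tombstone.
    static int purgeByTombstones(TreeMap<Integer, Long> columns, List<Tombstone> tombstones) {
        int removed = 0;
        for (Tombstone t : tombstones) {
            Iterator<Map.Entry<Integer, Long>> covered =
                columns.subMap(t.start, true, t.end, true).entrySet().iterator();
            while (covered.hasNext()) {
                if (covered.next().getValue() <= t.deletedAt) {
                    covered.remove(); // removal through the view also removes from columns
                    removed++;
                }
            }
        }
        return removed;
    }

    // Pick the strategy by partition width, as the description outlines.
    static int purge(TreeMap<Integer, Long> columns, List<Tombstone> newTombstones) {
        return columns.size() > WIDE_PARTITION_THRESHOLD
            ? purgeByTombstones(columns, newTombstones)
            : purgeByColumns(columns, newTombstones);
    }
}
```

      In this sketch the brute-force loop costs time proportional to columns × tombstones, while the tombstone-driven loop costs one sorted-map lookup per tombstone plus the columns it actually covers.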

      On the read path, the original code scans the whole range tombstone list of a partition to match sstable columns to their range tombstones. The RangeTombstonesReadOptimization.diff patch scans only the necessary range of tombstones, according to the filter used for the read.
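
      The read-side idea can be sketched in the same spirit. This is an illustration only, not the patch itself: it assumes a hypothetical list of tombstones sorted by start and non-overlapping, integer column names, and a read filter expressed as an inclusive [from, to] slice.

```java
import java.util.ArrayList;
import java.util.List;

class RangeTombstoneReadSketch {
    /** Hypothetical tombstone covering column names in [start, end]. */
    static final class Tombstone {
        final int start, end;
        Tombstone(int start, int end) { this.start = start; this.end = end; }
    }

    // Full scan: every tombstone of the partition is examined.
    static List<Tombstone> overlappingFullScan(List<Tombstone> sorted, int from, int to) {
        List<Tombstone> result = new ArrayList<>();
        for (Tombstone t : sorted)
            if (t.end >= from && t.start <= to) result.add(t);
        return result;
    }

    // Bounded scan: binary-search to the first tombstone that can overlap the
    // filter, then stop as soon as a tombstone starts past the end of the filter.
    // Relies on the list being sorted by start and non-overlapping (assumption).
    static List<Tombstone> overlappingSlice(List<Tombstone> sorted, int from, int to) {
        int lo = 0, hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).end < from) lo = mid + 1; else hi = mid;
        }
        List<Tombstone> result = new ArrayList<>();
        for (int i = lo; i < sorted.size() && sorted.get(i).start <= to; i++)
            result.add(sorted.get(i));
        return result;
    }
}
```

      With a narrow filter on a partition holding many tombstones, the bounded scan in this sketch touches only the tombstones inside the slice plus a logarithmic number of probes.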

        Attachments

        1. RangeTombstonesReadOptimization.diff
          17 kB
          Oleg Anastasyev
        2. RangeTombstonesWriteOptimization.diff
          4 kB
          Oleg Anastasyev
        3. 6446-Read-patch-v3.txt
          24 kB
          Sylvain Lebresne
        4. 6446-write-path-v3.txt
          5 kB
          Sylvain Lebresne

          Activity

            People

            • Assignee:
              m0nstermind Oleg Anastasyev
              Reporter:
              m0nstermind Oleg Anastasyev
              Authors:
              Oleg Anastasyev
              Reviewers:
              Sylvain Lebresne
            • Votes:
              0
              Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved: