Cassandra / CASSANDRA-8920

Optimise sequential overlap visitation for checking tombstone retention in compaction

Details

    Description

      The IntervalTree only maps partition keys. Since a majority of users deploy a hashed partitioner, the search is mostly wasted work: the keys of each sstable will be evenly distributed across the full token range owned by the node, so nearly every sstable overlaps nearly every key. In some cases this is a significant amount of work. If we like, we can corroborate a BF (Bloom filter) match against the file bounds as a sanity check, but performing an IntervalTree search is significantly more expensive (esp. once murmur hash calculation memoization goes mainstream).

      In LCS, the keys are bounded, so it might appear that the IntervalTree would help, but in this scenario we only compact against like bounds, so again it is not helpful.

      With a ByteOrderedPartitioner it could potentially be of use, but this is sufficiently rare to not optimise for IMO.
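
      To illustrate the idea, here is a minimal sketch of a sequential scan over the overlapping sstables that replaces the per-key IntervalTree search. It is not the attached patch: the class, the Table type, and the method name are hypothetical, and a LongPredicate stands in for the real Bloom filter. The filter is consulted first, and a match is then corroborated against the file bounds.

```java
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical sketch; none of these names are taken from the Cassandra code base.
final class OverlapCheckSketch {

    /** Stand-in for an sstable: token bounds, oldest timestamp, and a membership filter. */
    static final class Table {
        final long firstToken;
        final long lastToken;
        final long minTimestamp;
        final LongPredicate mightContain; // stand-in for a Bloom filter lookup

        Table(long firstToken, long lastToken, long minTimestamp, LongPredicate mightContain) {
            this.firstToken = firstToken;
            this.lastToken = lastToken;
            this.minTimestamp = minTimestamp;
            this.mightContain = mightContain;
        }
    }

    /**
     * Visits every overlapping table in order and returns the smallest
     * minTimestamp among those that may hold the token, or Long.MAX_VALUE
     * if none may hold it.
     */
    static long minTimestampOfPossibleHolders(List<Table> overlapping, long token) {
        long min = Long.MAX_VALUE;
        for (Table table : overlapping) {
            // Filter check first: a miss rules the table out.
            if (!table.mightContain.test(token)) {
                continue;
            }
            // Sanity check: discard filter false positives outside the file bounds.
            if (token < table.firstToken || token > table.lastToken) {
                continue;
            }
            min = Math.min(min, table.minTimestamp);
        }
        return min;
    }
}
```

      Under these assumptions, a tombstone for the key would be droppable only if its timestamp is lower than the returned value, since no other overlapping sstable could then hold older data for that key.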

      Attachments

        1. 8920.txt
          2 kB
          Benedict Elliott Smith

          People

            Benedict Elliott Smith
            Marcus Eriksson