  1. Cassandra
  2. CASSANDRA-8906

Experiment with optimizing partition merging when we can prove that some sources don't overlap


Details

    Description

      When we merge a partition from two sources and those two sources turn out not to overlap for that partition, we still end up doing one comparison per row in the first source. However, if we can prove that the two sources don't overlap, for example by using the sstable min/max clustering values that we store, we could speed this up. Note that in practice it's a little bit more hairy because we need to deal with N sources, but that's probably not too hard either.
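A rough sketch of the N-source case, in Python rather than Cassandra's Java and with entirely hypothetical names (`Source`, `group_overlapping`, `merge_partition`): sources are grouped by their min/max clustering bounds, and any source whose range is disjoint from all the others is streamed as-is, with no per-row comparison. It assumes integer clustering values and ignores reconciliation of rows with equal clustering, which a real merge would have to handle.

```python
import heapq
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Source:
    """Hypothetical stand-in for one sstable's rows in a single partition."""
    rows: List[int]  # clustering values, already sorted

    @property
    def min_clustering(self) -> int:
        return self.rows[0]

    @property
    def max_clustering(self) -> int:
        return self.rows[-1]


def group_overlapping(sources: List[Source]) -> List[List[Source]]:
    """Group sources whose [min, max] clustering ranges overlap, transitively."""
    groups: List[List[Source]] = []
    group_max = None
    non_empty = [s for s in sources if s.rows]
    for src in sorted(non_empty, key=lambda s: s.min_clustering):
        if groups and src.min_clustering <= group_max:
            # Range touches the current group: it must be merged with it.
            groups[-1].append(src)
            group_max = max(group_max, src.max_clustering)
        else:
            # Range starts after everything seen so far: start a new group.
            groups.append([src])
            group_max = src.max_clustering
    return groups


def merge_partition(sources: List[Source]) -> Iterator[int]:
    """Yield all rows in clustering order, comparing rows only within a group."""
    for group in group_overlapping(sources):
        if len(group) == 1:
            # Provably disjoint from every other source: no comparisons needed.
            yield from group[0].rows
        else:
            # Overlapping ranges: fall back to a regular k-way merge.
            yield from heapq.merge(*(s.rows for s in group))
```

With three sources covering `[1, 3]`, `[2, 5]` and `[10, 11]`, the first two are merged row by row while the third is appended without any comparison.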

      I'll note that using the sstable min/max clustering values is not terribly precise. We could do better if we were to push the same reasoning inside the merge iterator, for instance by using the sstable per-partition index, which can in theory tell us things like "don't bother comparing rows until the end of this row block". This is quite a bit more involved though, so maybe not worth the complexity.
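To illustrate the "don't bother comparing rows until the end of this row block" idea, here is a minimal two-source sketch in Python. It assumes each source exposes its rows as a list of index blocks (sorted lists of integer clustering values), which only loosely models the per-partition index; `merge_with_blocks` is a hypothetical name, and rows with equal clustering are kept as duplicates rather than reconciled.

```python
from collections import deque
from typing import Deque, Iterator, List

Block = List[int]  # sorted clustering values covered by one index block


def merge_with_blocks(left: List[Block], right: List[Block]) -> Iterator[int]:
    """Merge two block-structured sources, skipping per-row comparisons
    whenever a whole block ends before the other source's next row."""
    a: Deque[Block] = deque(block for block in left if block)
    b: Deque[Block] = deque(block for block in right if block)
    while a and b:
        # Keep `a` as the source whose next row sorts first.
        if b[0][0] < a[0][0]:
            a, b = b, a
        block, other_first = a[0], b[0][0]
        if block[-1] < other_first:
            # The block bounds alone prove there is no overlap:
            # emit the whole block without looking at individual rows.
            yield from a.popleft()
        else:
            # The block straddles the other source's next row:
            # compare row by row up to that point, then keep the remainder.
            i = 0
            while i < len(block) and block[i] <= other_first:
                yield block[i]
                i += 1
            if i == len(block):
                a.popleft()
            else:
                a[0] = block[i:]
    # At most one source still has blocks; emit them unchanged.
    for remaining in (a, b):
        for block in remaining:
            yield from block
```

For example, merging `[[1, 2, 3], [4, 8]]` with `[[5, 6], [7, 9]]` emits the blocks `[1, 2, 3]` and `[5, 6]` wholesale and only compares rows inside the two blocks that actually interleave.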


            People

              Assignee: Unassigned
              Reporter: Sylvain Lebresne (slebresne)
