Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-8915

Improve MergeIterator performance

    XMLWordPrintableJSON

    Details

      Description

      The implementation of MergeIterator uses a priority queue and applies a pair of poll+add operations for every item in the resulting sequence. This is quite inefficient as poll necessarily applies at least log N comparisons (up to 2log N), and add often requires another log N, for example in the case where the inputs largely don't overlap (where N is the number of iterators being merged).

      This can easily be replaced with a simple custom structure that can perform replacement of the top of the queue in a single step, which will very often complete after a couple of comparisons and in the worst case scenarios will match the complexity of the current implementation.

      This should significantly improve merge performance for iterators with limited overlap (e.g. levelled compaction).

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                blambov Branimir Lambov
                Reporter:
                blambov Branimir Lambov
                Authors:
                Branimir Lambov
                Reviewers:
                Benedict Elliott Smith
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 0.2h
                  0.2h