  1. Apache Cassandra
  2. CASSANDRA-1046

optimize Memtable.getSliceIterator

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 0.7 beta 1
    • Component/s: None
    • Labels: None

    Description

      As reported by James Golick, about 30% of the time in a read is spent in SliceQueryFilter.getMemColumnIterator, virtually all of which is in ConcurrentSkipListMap$Values.toArray().

      I wrote on the ML:

      Besides the UUID optimization you posted, we should do an audit of ColumnFamily.getSortedColumns and replace with iteration where possible (in this case, we'd be left with one copy of most of the columns, but that's better than two).

      We can get rid of the other copy by fixing the logic in Memtable.getSliceIterator, which says "copy all the columns, so we can do a binary search on them to find where to start," but since columns are natively in sorted order we could just use an iterator and a while loo
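
      The difference between the two approaches described above can be sketched as follows. This is a minimal illustration, not the attached patch: the class SliceSketch, both method names, and the use of String keys and values in place of Cassandra's column types are all assumptions made for the example. Only ConcurrentSkipListMap and the standard java.util calls are real APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: String keys stand in for column names,
// String values stand in for columns.
final class SliceSketch {

    // Copy-then-search: copies every key into a list just to
    // binary-search for the starting position.
    static List<String> sliceByCopy(ConcurrentSkipListMap<String, String> columns,
                                    String start, int count) {
        List<String> names = new ArrayList<>(columns.keySet()); // full copy
        int idx = Collections.binarySearch(names, start);
        if (idx < 0) {
            idx = -idx - 1; // insertion point when start is absent
        }
        List<String> out = new ArrayList<>();
        for (int i = idx; i < names.size() && out.size() < count; i++) {
            out.add(columns.get(names.get(i)));
        }
        return out;
    }

    // Iterate-in-place: the map is already sorted, so walk it with an
    // iterator and a while loop, skipping entries before start.
    static List<String> sliceByIteration(ConcurrentSkipListMap<String, String> columns,
                                         String start, int count) {
        List<String> out = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = columns.entrySet().iterator();
        while (it.hasNext() && out.size() < count) {
            Map.Entry<String, String> entry = it.next();
            if (entry.getKey().compareTo(start) < 0) {
                continue; // not yet at the start of the slice
            }
            out.add(entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> columns = new ConcurrentSkipListMap<>();
        columns.put("a", "1");
        columns.put("b", "2");
        columns.put("c", "3");
        columns.put("d", "4");
        System.out.println(sliceByCopy(columns, "b", 2));
        System.out.println(sliceByIteration(columns, "b", 2));
    }
}
```

      Both methods return the same slice; the second avoids materialising a copy of the whole map. The linear skip at the start follows the "iterator and a while loop" wording of the description, and whether the real fix skips linearly or seeks directly is not stated here.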

      Attachments

        1. 0001-trunk-cassandra-1046.patch
          5 kB
          Matthew F. Dennis
        2. insertarator.py
          2 kB
          Matthew F. Dennis
        3. readarator.py
          2 kB
          Matthew F. Dennis

          People

            mdennis Matthew F. Dennis
            jbellis Jonathan Ellis
            Matthew F. Dennis
            Votes: 0
            Watchers: 5