  1. Apache Cassandra
  2. CASSANDRA-2062

Better control of iterator consumption

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Low
    • Resolution: Fixed
    • 1.0.0
    • None
    • None

    Description

      The core reason for this ticket is to gain control over the consumption of the lazy nested iterators in the read path.

      We survive now because we write the size of the row at the front of the row (via some serious acrobatics at write time), which gives us hasNext() for rows for free. But it became apparent while working on the block-based format that hasNext() will not be cheap unless the current item has been consumed. "Consumption" of the row is easy, and blocks will be framed so that they can be skipped very easily. What you don't want is to seek to the end of the row to answer hasNext() and then seek back to the beginning to consume the row, which is what CollatingIterator would have forced us to do.
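
      To make the "hasNext() for free" point concrete, here is a minimal sketch of iterating over size-prefixed rows. The layout (a 4-byte length followed by the payload) and the class names SizePrefixedRows and SizePrefixDemo are assumptions made for illustration; this is not Cassandra's actual on-disk format or reader.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical reader for rows laid out as [int size][payload]. */
final class SizePrefixedRows implements Iterator<ByteBuffer> {
    private final ByteBuffer file;
    private int nextRowStart; // known as soon as the previous size prefix is read

    SizePrefixedRows(ByteBuffer file) {
        this.file = file.duplicate();
        this.nextRowStart = this.file.position();
    }

    @Override
    public boolean hasNext() {
        // Cheap: compares two offsets, never touches the current row's bytes.
        return nextRowStart < file.limit();
    }

    @Override
    public ByteBuffer next() {
        if (!hasNext()) throw new NoSuchElementException();
        int size = file.getInt(nextRowStart);          // absolute read of the prefix
        int bodyStart = nextRowStart + Integer.BYTES;
        ByteBuffer row = file.duplicate();
        row.limit(bodyStart + size);
        row.position(bodyStart);
        nextRowStart = bodyStart + size;               // skip without reading the body
        return row.slice();                            // lazy view; caller may ignore it
    }
}

final class SizePrefixDemo {
    /** Encodes each string as a 4-byte length followed by its UTF-8 bytes. */
    static ByteBuffer encode(String... rows) {
        int total = 0;
        for (String r : rows) total += Integer.BYTES + r.getBytes(StandardCharsets.UTF_8).length;
        ByteBuffer buf = ByteBuffer.allocate(total);
        for (String r : rows) {
            byte[] b = r.getBytes(StandardCharsets.UTF_8);
            buf.putInt(b.length);
            buf.put(b);
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        SizePrefixedRows rows = new SizePrefixedRows(encode("a", "bcd", ""));
        rows.next();                        // row handed out but never read
        System.out.println(rows.hasNext()); // still answerable without consuming it
    }
}
```

      Without the size prefix, the start of the next row is only known once the current row has been read to its end, which is the situation the block-based format runs into.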

      While we're at it, we can also improve efficiency: for M iterators containing N total items, commons.collections.CollatingIterator performs an O(M*N) merge and calls hasNext() multiple times per returned value. We can do better.
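
      One way to get both properties is a heap-based merge. The sketch below is illustrative only and is not the code attached to this ticket; the names HeapMergeIterator and MergeDemo are hypothetical. It keeps one pending item per source in a java.util.PriorityQueue, so each returned item costs O(log M), and it calls hasNext() on a source only after the item previously taken from that source has been handed to the caller.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical k-way merge over already-sorted iterators. */
final class HeapMergeIterator<T> implements Iterator<T> {
    /** A source iterator together with the one item currently pulled from it. */
    private static final class Candidate<T> {
        final Iterator<T> source;
        T item;
        Candidate(Iterator<T> source, T item) { this.source = source; this.item = item; }
    }

    private final PriorityQueue<Candidate<T>> heap;
    // Source whose item was returned by the last next(); advanced lazily so the
    // caller can consume that item before the source is asked hasNext().
    private Candidate<T> consumed;

    HeapMergeIterator(List<? extends Iterator<T>> sources, Comparator<? super T> cmp) {
        heap = new PriorityQueue<Candidate<T>>(Math.max(1, sources.size()),
                                               (a, b) -> cmp.compare(a.item, b.item));
        for (Iterator<T> it : sources) {
            if (it.hasNext()) heap.add(new Candidate<>(it, it.next()));
        }
    }

    /** Refills the heap from the source of the previously returned item, if any. */
    private void advance() {
        if (consumed == null) return;
        if (consumed.source.hasNext()) {   // exactly one hasNext() per returned item
            consumed.item = consumed.source.next();
            heap.add(consumed);            // O(log M)
        }
        consumed = null;
    }

    @Override
    public boolean hasNext() {
        advance();
        return !heap.isEmpty();
    }

    @Override
    public T next() {
        advance();
        Candidate<T> head = heap.poll();   // O(log M)
        if (head == null) throw new NoSuchElementException();
        consumed = head;
        return head.item;
    }
}

final class MergeDemo {
    public static void main(String[] args) {
        List<Iterator<Integer>> sources = new ArrayList<>();
        sources.add(Arrays.asList(1, 4, 7).iterator());
        sources.add(Arrays.asList(2, 5, 8).iterator());
        sources.add(Arrays.asList(3, 6, 9).iterator());
        Iterator<Integer> merged =
            new HeapMergeIterator<>(sources, Comparator.<Integer>naturalOrder());
        StringBuilder out = new StringBuilder();
        while (merged.hasNext()) out.append(merged.next()).append(' ');
        System.out.println(out.toString().trim()); // prints "1 2 3 4 5 6 7 8 9"
    }
}
```

      With one poll and at most one add per returned item, the whole merge is O(N log M). Equal items from different sources are emitted individually here; combining them into a single result is left out of the sketch.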

            People

              Assignee: Stu Hood (stuhood)
              Reporter: Stu Hood (stuhood)
              Authors: Stu Hood
              Reviewers: Jonathan Ellis
              Votes: 0
              Watchers: 3
