Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None

      Description

      The core reason for this ticket is to gain control over the consumption of the lazy nested iterators in the read path.

      We survive now because we write the size of the row at the front of the row (via some serious acrobatics at write time), which gives us hasNext() for rows for free. But it became apparent while working on the block-based format that hasNext() will not be cheap unless the current item has been consumed. "Consumption" of the row is easy, and blocks will be framed so that they can be very easily skipped, but you don't want to have to seek to the end of the row to answer hasNext, and then seek back to the beginning to consume the row, which is what CollatingIterator would have forced us to do.

      While we're at it, we can also improve efficiency: for M iterators containing N total items, commons.collections.CollatingIterator performs an O(M*N) merge, and calls hasNext multiple times per returned value. We can do better.
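
      As a rough sketch of that direction (illustrative only; LazyMergeSketch and Candidate are hypothetical names, not classes from any patch here), a priority-queue merge can defer touching a source iterator until the caller actually advances the merged view, so each child's hasNext() is called at most once per returned element:

        import java.util.Comparator;
        import java.util.Iterator;
        import java.util.List;
        import java.util.NoSuchElementException;
        import java.util.PriorityQueue;

        // Illustrative sketch only: a lazy k-way merge over M sorted iterators.
        public class LazyMergeSketch<T> implements Iterator<T>
        {
            private final PriorityQueue<Candidate> queue; // one candidate per non-exhausted source
            private Candidate consumed;                   // source to re-poll lazily on the next advance

            public LazyMergeSketch(List<? extends Iterator<T>> sources, final Comparator<? super T> comp)
            {
                queue = new PriorityQueue<Candidate>(Math.max(1, sources.size()), new Comparator<Candidate>()
                {
                    public int compare(Candidate a, Candidate b) { return comp.compare(a.item, b.item); }
                });
                for (Iterator<T> source : sources)
                    if (source.hasNext())
                        queue.add(new Candidate(source));
            }

            public boolean hasNext()
            {
                refill();
                return !queue.isEmpty();
            }

            public T next()
            {
                refill();
                if (queue.isEmpty())
                    throw new NoSuchElementException();
                consumed = queue.poll();                  // O(log M) per returned element
                return consumed.item;
            }

            // The only place a child is touched after construction: one hasNext()/next() pair
            // per element handed out, deferred until the caller asks for more.
            private void refill()
            {
                if (consumed == null)
                    return;
                if (consumed.source.hasNext())
                {
                    consumed.item = consumed.source.next();
                    queue.add(consumed);
                }
                consumed = null;
            }

            public void remove() { throw new UnsupportedOperationException(); }

            private final class Candidate
            {
                final Iterator<T> source;
                T item;
                Candidate(Iterator<T> source) { this.source = source; this.item = source.next(); }
            }
        }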

        Issue Links

          Activity

          jbellis Jonathan Ellis added a comment -

          committed, thanks!

          stuhood Stu Hood added a comment -

          > [CollatingIterator] calls hasNext() on its child iterator immediately after pulling off the least value from one.

          ReducingIterator also does not know whether there are more items until it has consumed past the end of the current item, which is why it was necessary to squash it in.

          > we still have the RI use in collectCollatedColumns [...] as well as LazyColumnIterator

          Fixed in 0003 and removed in 0004.

          > It looks to me like the main obstacle to using MI there is making MI.Reducer support customizable isEqual?

          The comparator passed to the MI is used for isEqual, so it should be pluggable already.
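
          (For illustration, a reducer that keys its grouping off the merge comparator rather than equals() could look like the sketch below; ReducerSketch and its method names are assumptions, not the actual MergeIterator.Reducer API.)

            import java.util.Comparator;

            // Sketch only: "same group" is decided by whatever Comparator the caller handed to
            // the merge, so the grouping used for reduction is pluggable along with the sort order.
            abstract class ReducerSketch<In, Out>
            {
                private final Comparator<? super In> comparator;
                private In last;

                protected ReducerSketch(Comparator<? super In> comparator)
                {
                    this.comparator = comparator;
                }

                // true if 'next' belongs to the same group as the previously added item
                final boolean isEqual(In next)
                {
                    return last != null && comparator.compare(last, next) == 0;
                }

                final void add(In next)
                {
                    last = next;
                    reduce(next);
                }

                protected abstract void reduce(In item);  // fold one input into the pending output
                protected abstract Out getReduced();      // emit the pending output and reset
            }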

          jbellis Jonathan Ellis added a comment -

          It looks like the core problem is that (because CollatingIterator does not require that its inputs be unique?), it calls hasNext() on its child iterator immediately after pulling off the least value from one.

          So I think this is fixable by creating a PQ-using UniquesCollatingIterator, for instance (better name left as an exercise for the reader).

          But, the MergingIterator approach with the reduce incorporated has a kind of elegance to it as well. The reduce logic is certainly simpler in MI. So I'm fine w/ replacing CI+RI with MI if that's the approach you want, but I'd like to make it a clean replace – we still have the RI use in collectCollatedColumns, as you noted above, as well as LazyColumnIterator.

          It looks to me like the main obstacle to using MI there is making MI.Reducer support customizable isEqual?

          stuhood Stu Hood added a comment -

          > just use AbstractIterator. computeNext the block and then it handles making arbitrary combinations of next/hasNext efficient.

          This assumes that the content of the row fits in memory, which is one of the things we were trying to avoid, I believe.

          jbellis Jonathan Ellis added a comment - edited

          > To avoid seeking, we need to guarantee:
          > A = next()
          > A is completely consumed (see [1])
          > hasNext() ?
          > B = next()

          Right, I'm saying that instead of trying to guarantee this particular pattern, which limits generality (and introduces subtle performance problems if this is violated), just use AbstractIterator. computeNext the block and then it handles making arbitrary combinations of next/hasNext efficient.

          stuhood Stu Hood added a comment -

          This design is intended to preserve the status quo of nested iterators by adjusting their contract slightly. If we're willing to make more extensive changes to compaction, I truly believe that the design put forth on CASSANDRA-847 is an order of magnitude more elegant.

          stuhood Stu Hood added a comment -

          > I don't think I understand the problem we're trying to solve here. If hasNext is expensive on a given iterator, the straightforward fix is to cache the answer (and invalidate it on next()).

          hasNext is only expensive if the current item hasn't been consumed: in order to answer "hasNext" in a block-based format (where the length of the entire row is not available) you have to read the remainder of the row, which means scanning the data and potentially seeking.

          The fix this ticket implements is to guarantee that hasNext is called on the child iterators lazily, only once the consumer of the MergeIterator calls hasNext, rather than eagerly after an item (row) is generated, as CollatingIterator does. To avoid seeking, we need to guarantee:

          1. A = next()
          2. A is completely consumed (see [1])
          3. hasNext() ?
          4. B = next()

          The Collating+ReducingIterator used on SSTableScanners by CompactionIterator is the particular area of concern. The items that are being iterated over are rows, which are themselves IColumnIterators. An SSTableScanner in trunk reads the row length from the front of the row, so hasNext does not require a seek. But CASSANDRA-674 will not store a row length anywhere (it's a feature!), and there might be multiple blocks involved in a single IColumnIterator.

          [1] in CASSANDRA-2629, "consumption" of an IColumnIterator is added as a contract on close()
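
          (A minimal sketch of that pattern from the consumer's side, with hypothetical names; the point is just that the inner row iterator is drained before the outer hasNext() is asked again.)

            import java.util.Iterator;
            import java.util.function.Consumer;

            // Illustrative only: each nested row iterator is consumed completely (steps 1-2)
            // before the outer merged iterator is asked whether another row exists (steps 3-4).
            final class ConsumptionSketch
            {
                static <T> void drainInOrder(Iterator<? extends Iterator<T>> mergedRows, Consumer<T> sink)
                {
                    while (mergedRows.hasNext())             // 3. hasNext() ?
                    {
                        Iterator<T> row = mergedRows.next(); // 1. A = next()
                        while (row.hasNext())
                            sink.accept(row.next());         // 2. A is completely consumed
                    }                                        // 4. loop around for B = next()
                }
            }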

          stuhood Stu Hood added a comment -

          Rebased.

          jbellis Jonathan Ellis added a comment -

          > hasNext() will not be cheap unless the current item has been consumed

          I don't think I understand the problem we're trying to solve here. If hasNext is expensive on a given iterator, the straightforward fix is to cache the answer (and invalidate it on next()). The most obvious way to do this is to use the guava AbstractIterator which implements both hasNext and next in terms of computeNext.
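
          (A minimal sketch of that pattern, assuming guava on the classpath; the Supplier-based wrapper below is a hypothetical stand-in for whatever does the expensive row read, not code from the patch.)

            import com.google.common.collect.AbstractIterator;
            import java.util.Iterator;
            import java.util.function.Supplier;

            // Sketch: computeNext() does the expensive work at most once per element, and
            // AbstractIterator caches the result, so any interleaving of hasNext()/next() stays cheap.
            final class CachedHasNextSketch
            {
                // 'source' stands in for "read the next row, or return null at end of data"
                static <T> Iterator<T> wrap(final Supplier<T> source)
                {
                    return new AbstractIterator<T>()
                    {
                        @Override
                        protected T computeNext()
                        {
                            T next = source.get();
                            return next == null ? endOfData() : next;
                        }
                    };
                }
            }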

          stuhood Stu Hood added a comment -

          Rebased into two patches. There is one remaining usage of ReducingIterator in QueryFilter.collectCollatedColumns, but removing it will require a bit of refactoring.

          stuhood Stu Hood added a comment -

          Squashes the previous 0001-0006 into a new 0001-0002 and adds 0003 and 0004 containing a ReducingIterator replacement. I haven't completed porting consumers of ReducingIterator, but I don't anticipate much trouble there.

          stuhood Stu Hood added a comment -

          Edited the description/title to indicate that there is a core issue at stake here, rather than just an algorithm change.

          stuhood Stu Hood added a comment -

          Missed a usage in RowIterator.

          stuhood Stu Hood added a comment -

          johano pointed out that http://code.google.com/p/caliper/ would be a good way to confirm that this doesn't break our common case usages of Collating/Merge. I'm going to perform the Reducing+MergeIterator smash tonight, and then I'll probably spend microtime microbenchmarking.

          stuhood Stu Hood added a comment -

          Also, as an important side note: the core reason for implementing this was to gain control over the consumption of the lazy nested iterators in the read path.

          We survive now because we write the size of the row at the front of the row (via some serious acrobatics at write time), which gives us hasNext() for rows for free. But it became apparent while working on the block-based format that hasNext() will not be cheap unless the current item has been consumed. "Consumption" of the row is easy, and blocks will be framed so that they can be very easily skipped, but you don't want to have to seek to the end of the row to answer hasNext, and then seek back to the beginning to consume the row, which is what CollatingIterator would have forced us to do.

          There is actually one more roadblock to making this crazy iterator dance work: ReducingIterator pulls one more item than it needs from its source before 'reducing' (consuming) the items it is holding. This breaks the invariant, but it should be fixable by smashing ReducingIterator into MergeIterator.

          stuhood Stu Hood added a comment -

          The times in that paste aren't supposed to be significant... I was just trying to get the count of comparisons, so I was doing other things with my laptop. That test uses very small keys (<6 bytes), so the comparisons are relatively cheap.

          stuhood Stu Hood added a comment -
          • 0001 - Adds MergeIterator, which uses a priority queue to perform an O(2*log(M)*N) merge in the same space. Also guarantees to only call hasNext once per item.
          • 0002 - Adds some debug output for checking the number of comparisons performed during compaction
          • 0003 - Replaces CollatingIterator for Compaction
          • 0004 - Replaces CollatingIterator elsewhere
          • 0005 - Removes debug output and CollatingIterator from FBUtilities

          The number of comparisons needed to compact M=sstables=100, N=rows=80000 drops from ~8 million (M*N) to ~1 million (2*log(M)*N), as expected.

          # MergeIterator
          1 for org.apache.cassandra.io.CompactionIterator@7dc21ece
          org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=1 colsper=200000: 2758 ms
          399999 for org.apache.cassandra.io.CompactionIterator@20ca5bff
          org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=200000 colsper=1: 4504 ms
          917546 for org.apache.cassandra.io.CompactionIterator@6360f5bf
          org.apache.cassandra.db.LongCompactionSpeedTest: sstables=100 rowsper=800 colsper=5: 1021 ms
          
          
          # CollatingIterator
          1 for org.apache.cassandra.io.CompactionIterator@5f873eb2
          org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=1 colsper=200000: 1732 ms
          399999 for org.apache.cassandra.io.CompactionIterator@357c7988
          org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=200000 colsper=1: 4499 ms
          7915050 for org.apache.cassandra.io.CompactionIterator@2ae0420b
          org.apache.cassandra.db.LongCompactionSpeedTest: sstables=100 rowsper=800 colsper=5: 1427 ms
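
          (Taking the log above as base 2, those counts line up: M*N = 100 * 80000 = 8,000,000 against the 7,915,050 comparisons measured with CollatingIterator, and 2*log2(100)*80000 ≈ 2 * 6.64 * 80000 ≈ 1,060,000 against the 917,546 measured with MergeIterator.)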

            People

            • Assignee:
              stuhood Stu Hood
              Reporter:
              stuhood Stu Hood
              Reviewer:
              Jonathan Ellis
            • Votes: 0
              Watchers: 3
