Details

      Description

      The row index contains entries for configurably sized blocks of a wide row. For a row of appreciable size, the row index ends up directing the third seek (1. index, 2. row index, 3. content) to near the first column of a scan.

      Since the row index is always used for wide rows, and since it contains information that tells us whether the third seek is necessary at all (the column range or name we are trying to slice may not exist in a given sstable), promoting the row index into the sstable index would drop the maximum number of seeks for wide rows back to two and, more importantly, would allow sstables to be eliminated using only the index.

      An example use case that benefits greatly from this change is time-series data in wide rows, where data is appended to the beginning or end of the row. Our existing compaction strategy gets lucky and clusters the oldest data in the oldest sstables: for queries to recently appended data, we would be able to eliminate wide rows using only the sstable index, rather than needing to seek into the data file to determine that they aren't interesting. For narrow rows, this change would have no effect, as they will not reach the threshold for indexing anyway.

      A first-cut design for this change would look very similar to the file format design proposed on #674 (http://wiki.apache.org/cassandra/FileFormatDesignDoc): row keys clustered, column names clustered, and offsets clustered and delta-encoded.
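      The elimination check described above can be sketched in a few lines. This is an illustrative model only (the class and method names here are hypothetical, not Cassandra code): an index entry that carries a row's column range lets a slice query skip an sstable without touching its data file.

```java
// Hypothetical sketch: a promoted index entry carrying enough column-range
// information to eliminate an sstable using only the index file.
public class PromotedIndexSketch {
    static class IndexEntry {
        final long dataOffset;   // position of the row's block in the data file
        final String minColumn;  // first column name present in this row
        final String maxColumn;  // last column name present in this row

        IndexEntry(long dataOffset, String minColumn, String maxColumn) {
            this.dataOffset = dataOffset;
            this.minColumn = minColumn;
            this.maxColumn = maxColumn;
        }

        // True if the requested slice [start, end] could overlap this row's columns.
        boolean mayContain(String start, String end) {
            return start.compareTo(maxColumn) <= 0 && end.compareTo(minColumn) >= 0;
        }
    }

    public static void main(String[] args) {
        IndexEntry e = new IndexEntry(0L, "2011-01", "2011-06");
        // A slice for newer data falls outside the covered range: the sstable is
        // eliminated without any seek into the data file.
        System.out.println(e.mayContain("2011-07", "2011-12"));
        System.out.println(e.mayContain("2011-05", "2011-12"));
    }
}
```

      For the time-series case above, older sstables whose ranges end before the queried slice are dropped at index time, which is exactly the "eliminate using only the index" win.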

      1. version-g-lzf.txt
        2 kB
        Stu Hood
      2. version-g.txt
        2 kB
        Stu Hood
      3. version-f.txt
        2 kB
        Stu Hood
      4. promotion.pdf
        25 kB
        Stu Hood
      5. 2319-v2.tgz
        62 kB
        Stu Hood
      6. 2319-v1.tgz
        64 kB
        Stu Hood

        Issue Links

          Activity

          Sylvain Lebresne added a comment -

          does column_index_size_in_kb actually affect row keys

          No, it doesn't. In fact, this patch doesn't change anything about what composes the indexes, and in particular the setting has the same meaning. The only difference is that the column indexes are in the on-disk index file instead of the on-disk data file.

          Jonathan Ellis added a comment -

          does column_index_size_in_kb actually affect row keys as well as column names, or are those still special-cased to always be present in the (on-disk) index?

          Sylvain Lebresne added a comment -

          which makes tuning complicated

          No, you're right, that part didn't change. However, a very easy fix would be to turn index_interval into index_interval_in_kb, as I said above. Without deeper changes, the index interval would have to include at least one key (i.e. the index of one key), but aside from that it would give pretty much what we want in terms of making tuning work equally well for narrow and wide rows. And it's a very simple change, which doesn't preclude deeper changes later if we so wish, but in the meantime would give us 91.32% (my best guess) of the benefits. We could even rename the settings to, say, disk_index_interval_in_kb and memory_index_interval_in_kb (and internally both would really mean that the interval is min({disk,memory}_index_interval_in_kb, 1 row)).

          Jonathan Ellis added a comment -

          One of the setting is about how sparse is the main sstable index ... the other is how sparse is the in-memory summary of the on-disk sparse index.

          The problem is that one only applies to row keys, and the other only to column names, which makes tuning complicated.

          Or did that change, leaving the setting names obsolete?

          Sylvain Lebresne added a comment -

          Sure, I understand. What I'm saying is that the initial discussion was about removing a setting, but we can't. One of the settings is about how sparse the main sstable index is (it's already sparse, in fact: we're not indexing every column, but every min(1 row, column_index_size_in_kb)); the other is how sparse the in-memory summary of the on-disk sparse index is.
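          The granularity described here (an index entry every column_index_size_in_kb of serialized data, but at least one entry per row) amounts to a simple min. A tiny sketch with illustrative names, not actual Cassandra code:

```java
// Sketch of the effective indexing interval: entries are written every
// column_index_size_in_kb of data, but a row always gets at least one entry,
// so the effective interval is min(1 row, configured size).
public class IndexGranularity {
    static long effectiveIntervalBytes(long rowSizeBytes, int columnIndexSizeInKb) {
        return Math.min(rowSizeBytes, (long) columnIndexSizeInKb * 1024);
    }

    public static void main(String[] args) {
        // A 2 KB row with a 64 KB setting: one entry covers the whole row.
        System.out.println(effectiveIntervalBytes(2 * 1024, 64));
        // A 1 MB wide row: entries every 64 KB.
        System.out.println(effectiveIntervalBytes(1024 * 1024, 64));
    }
}
```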

          Stu Hood added a comment - edited

          I.e. the raison d'être of both index_interval and column_index_size_in_kb is not because we have the notion of rows in the on-disk format.

          If I'm understanding what Ellis is suggesting, it is that the entire sstable index could become sparse: that would mean that column_index_size_in_kb could be renamed to index_size_in_kb. index_interval would not change.

          EDIT: Well, the meaning of index_interval might change, because even in a sparse index, you might want to store something even sparser in memory.

          Sylvain Lebresne added a comment -

          I still don't see how that would make us remove one of the two config options we're talking about. Even if you basically switch to an on-disk format where there is no notion of a row, you still have to say (1) how big the blocks you index in the index file are, and (2) how many of those index entries you load in memory (unless you decide to load the full index in memory, but I don't think that's what we're talking about). I.e. the raison d'être of both index_interval and column_index_size_in_kb is not that we have the notion of rows in the on-disk format.

          Don't get me wrong, I'm not saying that having a file format where the row key is no longer special is a bad idea at all (though I'll admit I'm less convinced that moving C* to such a file format should be a priority), but it seems to me that's a completely different debate.

          Stu Hood added a comment -

          What if we dropped the "main" index and just kept the "sample" index of every 1/128 columns?

          This is what the original 674-v1 patch did, and it worked out fairly well. In the long run, it would be a win in terms of config and code complexity, but it will probably be about the same performance-wise.

          From 674-v1 (where ColumnKey is a full path to a column: (key, name1, name2, name3, ...)):

          +/**
          + * An entry in the SSTable index file which points to a block in the data file.
          + * Each entry contains the full path of the column at the beginning of the block
          + * in the SSTable, and two file positions: the offset of the serialized version
          + * of this object in the index, and the offset of the block in the data file.
          + *
          + * To find a key in the data file, we first look at IndexEntries in memory, and find
          + * the last entry less than the key we want. We then seek to the position of that
          + * entry in the index file, read forward until we find the last entry less than the
          + * key, and then seek to the position of that entry's block in the data file.
          + */
          +public class IndexEntry extends ColumnKey
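          The lookup the javadoc describes can be sketched with in-memory stand-ins for the two levels (the classes and names below are illustrative, not the 674-v1 code): find the last sampled entry <= the key, then read forward in the "index file" to the last entry <= the key, then use that entry's block offset.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TwoLevelLookup {
    // memorySample: every Nth index entry -> its position in the index "file".
    // indexFile: sorted (key -> block offset in the data "file") entries.
    static long findBlockOffset(TreeMap<String, Integer> memorySample,
                                List<Map.Entry<String, Long>> indexFile,
                                String key) {
        // Last in-memory sample <= key tells us where to start reading the index.
        int start = memorySample.floorEntry(key).getValue();
        long offset = -1;
        // Read forward until we pass the key; keep the last entry <= key.
        for (int i = start; i < indexFile.size(); i++) {
            if (indexFile.get(i).getKey().compareTo(key) > 0) break;
            offset = indexFile.get(i).getValue();
        }
        return offset; // seek here in the data file
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> index = List.of(
                Map.entry("a", 0L), Map.entry("f", 100L),
                Map.entry("k", 200L), Map.entry("p", 300L));
        TreeMap<String, Integer> sample = new TreeMap<>();
        sample.put("a", 0); // sample every 2nd entry
        sample.put("k", 2);
        System.out.println(findBlockOffset(sample, index, "m"));
    }
}
```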
          
          Sylvain Lebresne added a comment -

          What if we dropped the "main" index and just kept the "sample" index of every 1/128 columns? Seems like we'd trade a little more seq i/o to do less random i/o, and being able to get rid of the index sampling phase on startup...

          I am not sure I follow.

          Jonathan Ellis added a comment -

          I don't see an easy way to merge those 2 settings into 1 if that was what you were hinting to.

          Yes, that's where I was going.

          What if we dropped the "main" index and just kept the "sample" index of every 1/128 columns? Seems like we'd trade a little more seq i/o to do less random i/o, and being able to get rid of the index sampling phase on startup...

          Sylvain Lebresne added a comment -

          we have the row index_interval measured in rows, but for columns we have column_index_size_in_kb

          Yes, I suppose we could switch index_interval to the size_in_kb style, which would better limit the amount of data we may end up reading in the index now that row index entries may have more widely varying sizes. However, I don't see an easy way to merge those two settings into one, if that was what you were hinting at.

          So, this isn't quite true

          I really just meant that the committed code has no interaction with the compression code. You are right, though, that having compression blocks not aligned with data boundaries can mean that sometimes we'll have to decompress one more block than would be strictly necessary, but that's more a drawback of compression blocks not being aligned to any data boundaries than anything related to this actual patch. I don't know how much that matters in practice, though.

          Stu Hood added a comment -

          We ended up making compression at the SequentialWriter level, i.e. transparent to the data and index format, so this ends up having no interaction whatsoever with compression.

          So, this isn't quite true: because the column/tuple index is unaware of blocks, and vice versa, the current scheme has to deal with alignment mismatches: tuples that fall on block boundaries require reading two blocks' worth of data in order to decompress one tuple. Not a very frequent occurrence when the tuple size is small enough... but anyway.
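          The cost of the alignment mismatch is easy to quantify: a tuple spanning a compression-block boundary touches two compressed blocks. A back-of-envelope sketch (plain arithmetic, not Cassandra code):

```java
// How many fixed-size compression blocks does a tuple at
// [offset, offset + len) in the uncompressed stream touch?
public class BlockAlignment {
    static int blocksTouched(long offset, long len, long blockSize) {
        long first = offset / blockSize;
        long last = (offset + len - 1) / blockSize;
        return (int) (last - first + 1);
    }

    public static void main(String[] args) {
        // A 200-byte tuple well inside a 64 KB block: one decompression.
        System.out.println(blocksTouched(100, 200, 64 * 1024));
        // A 20-byte tuple straddling a block boundary: two decompressions.
        System.out.println(blocksTouched(64 * 1024 - 10, 20, 64 * 1024));
    }
}
```

          With small tuples relative to the block size, the straddling case is rare (roughly len/blockSize of tuples), matching Stu's "not a very frequent occurrence" remark.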

          Jonathan Ellis added a comment -

          Ah, what I really meant was getColumnIndexSize. It seems like the next step should be to treat rows and columns more uniformly – we have the row index_interval measured in rows, but for columns we have column_index_size_in_kb.

          Sylvain Lebresne added a comment -

          Yes, that method is dead code but that's actually not new from this patch. I've created CASSANDRA-4076.

          Jonathan Ellis added a comment -

          I see getIndexedReadBufferSizeInKB is still around, is that dead code now?

          Sylvain Lebresne added a comment -

          You're right; I've pushed the trivial fix with commit 65c33fa.

          Dave Brosius added a comment -

          Perhaps I am reading this wrong, but this patch appears to make the SSTableIdentityIterator constructor expect sstable to be non-null, which is not the case for this constructor:

          public SSTableIdentityIterator(CFMetaData metadata, DataInput file, DecoratedKey key, long dataStart, long dataSize, IColumnSerializer.Flag flag)
          throws IOException
          {
              this(metadata, file, key, dataStart, dataSize, false, null, flag);
          }

          which is called from IncomingStreamReader.streamIn.

          I would expect an NPE in this case.
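          The failure mode Dave describes can be reproduced with simplified stand-ins (the classes below are illustrative, not the real SSTableIdentityIterator): a convenience constructor delegates with null for a collaborator, and any code path that later dereferences it throws.

```java
// Stand-alone illustration of an NPE introduced by a delegating constructor
// that passes null for a collaborator another code path dereferences.
public class NullDelegationDemo {
    static class Table {
        String name() { return "t"; }
    }

    static class IdentityIterator {
        final Table table;

        IdentityIterator(Table table, boolean checked) {
            this.table = table; // no null check here
        }

        // Convenience constructor mirroring the streaming path: passes null.
        IdentityIterator() {
            this(null, false);
        }

        String describe() {
            return table.name(); // NPE when constructed via IdentityIterator()
        }
    }

    public static void main(String[] args) {
        try {
            new IdentityIterator().describe();
        } catch (NullPointerException e) {
            System.out.println("NPE as expected");
        }
    }
}
```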

          Sylvain Lebresne added a comment -

          Committed, thanks

          Stu Hood added a comment -

          And truth is, I think it's a nice property that we should probably maintain unless we have a good reason not to

          Totally agreed. My mind blipped here, and for some reason I thought that you had removed the metadata in addition to the BF/column-index.

          +1 #shipit

          Sylvain Lebresne added a comment -

          Pushed a new commit on the github branch to address the following remarks:

          • ColumnIndexer
            • Consider renaming ColumnIndexer -> ColumnIndex, and moving all creation logic into factory functions... ColumnIndexer.Result is redundant
            • Will creating a BloomFilter with estimated size 0 do the right thing here? Perhaps explicitly ask for the empty bloom filter?

          Agreed, that's much cleaner; done.

          Consider adding an EMPTY/LIVE singleton DeletionInfo

          Not really related to this patch but good idea, done.

          the null checks in create() probably mean create() should be two overloaded methods

          We could, but the truth is create() is only ever called from one place, where it takes all its arguments. In fact the null check should never be triggered in the current code, but I figured it doesn't cost much to code defensively in case someone wants to use null instead of an empty index.

          the if (dis instanceof FileDataInput) skipFully logic should be static somewhere

          Right, that was copy-pasted code from the old ColumnIndexer, but in fact FileUtils.skipBytesFully() already does the right thing, so I just removed the whole if by using that.

          This class is probably in the wrong package

          I've put it in db because the old ColumnIndexer (from which RowIndexEntry inherits some of its code) was there too. Overall the db package already has sstable-related stuff, so I don't care too much and don't feel like changing it as I'm writing this. But I may move it during commit if I remember to.

          • IndexedSliceReader
            file = originalInput == null ? sstable.getFileDataInput(positionToSeek) : originalInput;
            file.seek(positionToSeek);
            

            could probably be cleaner

          I'm open to suggestions

          sstable.decodeKey(ByteBufferUtil.readWithShortLength(file)); -> ByteBufferUtil.skipShortLength

          Done

          When we decide not to store metadata in the file, the datafile is no longer enough to recover the data in case of lost indexes, which is probably fine if streaming has headed in a direction where we always send the index as well. Once we've passed that point of no-return, we could also move metadata out for narrow rows and consider removing the key from the data file row header, leaving the datafile containing nothing but columns (minor compression bonus)

          The thing is, we don't yet send the index while streaming. Even with this patch, the data file still has enough information to recompute the index file. And truth is, I think that's a nice property that we should probably maintain unless we have a good reason not to (it is true that with compression the data file alone is not enough, as you need to know the compression algorithm and the block size, but those are properties of the CF even if we lose the compression component).

          Stu Hood added a comment -
          • ColumnIndexer
            • Consider renaming ColumnIndexer -> ColumnIndex, and moving all creation logic into factory functions... ColumnIndexer.Result is redundant
            • Will creating a BloomFilter with estimated size 0 do the right thing here? Perhaps explicitly ask for the empty bloom filter?
          • DeletionInfo
            • Consider adding an EMPTY/LIVE singleton DeletionInfo
          • RowIndexEntry
            • the null checks in create() probably mean create() should be two overloaded methods, one of which checks the columnIndex size
            • the if (dis instanceof FileDataInput) skipFully logic should be static somewhere, and should definitely use DataInput.skipBytes in a loop rather than allocating a buffer
            • This class is probably in the wrong package... it's very sstable specific. Also, class javadoc.
          • IndexedSliceReader
            • file = originalInput == null ? sstable.getFileDataInput(positionToSeek) : originalInput;
              file.seek(positionToSeek);

              could probably be cleaner

          • SimpleSliceReader
            • sstable.decodeKey(ByteBufferUtil.readWithShortLength(file)); -> ByteBufferUtil.skipShortLength
          • General comments
            • When we decide not to store metadata in the file, the datafile is no longer enough to recover the data in case of lost indexes, which is probably fine if streaming has headed in a direction where we always send the index as well. Once we've passed that point of no-return, we could also move metadata out for narrow rows and consider removing the key from the data file row header, leaving the datafile containing nothing but columns (minor compression bonus).

          This looks good Sylvain... moving the bloom filter into the index as well looks like it ended up being a simple win.

          Sylvain Lebresne added a comment -

          Would it simplify things to just say "row deletion times are always in the index" and avoid having to have two code paths for that?

          Not really, at least not unless we also keep a column index entry for the narrow case. Otherwise we have to seek to the beginning of the row anyway, so we have two code paths anyway (and we'll end up deserializing the deletion times anyway). Besides, we also have to be backward compatible, so in fact we wouldn't really gain much, since the new narrow case is almost the same as the old one.
          Now, it is true that after this patch it could be worth making the distinction static CF versus dynamic CF rather than narrow versus wide, as it would make sense to keep a column index entry in the dynamic case (even if the row is narrow for that sstable), but I think this warrants some experimentation first, and I'd be in favor of leaving that to a follow-up ticket.

          Jonathan Ellis added a comment -

          for narrow [rows], we'll end up seeking at the beginning of the row, where the deletion times are anyway

          Would it simplify things to just say "row deletion times are always in the index" and avoid having to have two code paths for that?

          Sylvain Lebresne added a comment -

          For wide rows, the index entry also ships with the row deletion times

          Why just wide rows?

          Because for narrow ones, we'll end up seeking at the beginning of the row, where the deletion times are anyway.

          How did this work out, did it end up simplifying compression?

          We ended up implementing compression at the SequentialWriter level, i.e. transparent to the data and index formats, so this ends up having no interaction whatsoever with compression.

          Jonathan Ellis added a comment -

          For wide rows, the index entry also ships with the row deletion times

          Why just wide rows?

          I'm in favor of trying this (honestly more because I believe this could make checksumming and compression easier to add than anything else)

          How did this work out, did it end up simplifying compression?

          Sylvain Lebresne added a comment -

          I've put a version of this issue at https://github.com/pcmanus/cassandra/commits/2319_index_promotion (against current trunk). Contrary to the previously attached patches, this doesn't change the file format much. It pretty literally does what the issue title says: it promotes the column index from the data file to the index file. Note that the patch is split into 3 commits that have some form of logical separation, but the code only compiles with all 3 commits.

          So this removes the column index and bloom filter from the row header in the data file and moves them into the index file along with the (key, position) pair. There are a number of choices/details worth mentioning:

          • Only wide rows have a column index and bloom filter. So one difference with the current implementation is that skinny rows have no column bloom filter. I figure that it's probably not worth the space in the index file in the latter case (but I'm fine discussing that point)
          • The key cache now keeps the whole information from the index file for a given row. This means that for wide rows, the column index and bf are cached along with the position. Which is imo a good thing, but does mean the size of a key cache entry is not constant anymore (the estimation of the key cache memory size will have to be modified accordingly, but the current patch doesn't do it).
          • For wide rows, the index entry also ships with the row deletion times. This is necessary since we won't seek to the beginning of the row anymore.
          • In the column indexes, offsets are relative to the beginning of the row in the data file, rather than to the beginning of the index as is the case now.
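
          As a rough illustration of the bullets above, a promoted index entry could be serialized along these lines. This is a hypothetical sketch only — the field names, widths, and ordering are illustrative assumptions, not the patch's actual on-disk format:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: (key, position) for every row, and for wide rows only,
// the row deletion times plus a column index whose block offsets are relative
// to the row's start position in the data file.
public class IndexEntrySketch {
    public static byte[] serialize(byte[] key, long position, boolean wide,
                                   long markedForDeleteAt, int localDeletionTime,
                                   byte[][] blockNames, long[] blockOffsets) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeShort(key.length);
        out.write(key);
        out.writeLong(position);              // row start in the data file
        out.writeBoolean(wide);
        if (wide) {
            out.writeLong(markedForDeleteAt); // deletion times ride along so a
            out.writeInt(localDeletionTime);  // wide-row read can skip the row header
            out.writeInt(blockOffsets.length);
            for (int i = 0; i < blockOffsets.length; i++) {
                out.writeShort(blockNames[i].length);
                out.write(blockNames[i]);     // first column name of the block
                out.writeLong(blockOffsets[i]); // relative to `position`, not absolute
            }
        }
        return bytes.toByteArray();
    }
}
```

          Note how the narrow case stops after the boolean, which matches the point above that skinny rows pay almost nothing extra in the index file.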

          Some other implementation points:

          • EchoedRow is removed. It would be possible to echo rows following this patch, but we would need to echo the column index too; that felt complicated enough that it can be left to a later ticket if we consider it worth it.
          • I didn't find a way to implement this patch that wasn't overly complicated/inefficient without using seek() instead of just file marks. So in particular MappedFileDataInput gets a seek() method, even though that method throws an exception if we seek outside the segment (which should never happen).

          I did a short (and honestly not very scientific) benchmark of a time-series-like workload, with a number of threads inserting time series columns into a bunch of rows and other threads reading the tail of those rows (as expected, the performance degrades as more sstables are added and improves with compaction). As soon as more than 1 sstable was present, the performance with this patch was around 30-40% better than without it. I'll note that the test was very short and ran entirely on localhost, so again the exact benefits may vary, but the ability to discard sstables based on index info (saving a seek) seems to be a clear boost in that case.

          I didn't see any noticeable difference (good or bad) on a normal stress run, as should be expected.

          Note that this patch paves the way to removing the two-phase compaction of LazilyCompactedRow, but that is left to a follow-up ticket.

          Ryan King added a comment -

          I haven't followed that ticket closely, but I think the answer is yes. For wide row use cases this patch lets you eliminate SStables with only the info in the index (because we know what range(s) of columns for a row are in that file).

          Jonathan Ellis added a comment -

          Would this allow us to do CASSANDRA-2498 style optimizations on slice queries?

          Stu Hood added a comment - - edited

          Attaching a version of this patch that has been rebased atop 674. Together they exhibit very good performance.

          Regarding the key-cache: in this latest revision the SSTableReader will only cache narrow rows (based on the number of blocks they contain). IMO, this is a reasonable temporary middle ground, since wide rows can be eliminated using only the index, and narrow rows continue to enjoy the benefits of the key cache. The long term goal would still be to cache the block headers for the file individually.

          EDIT: The fundamental change in this revision is that in order to gain random access to a portion of a row, an SSTableReader.getPositions method now returns a collection of row headers, which are essentially equivalent to the row index in the existing format. This information is enough to eliminate a datafile.

          Stu Hood added a comment - - edited

          Rebased 2319 without changes: applies atop CASSANDRA-2336 and CASSANDRA-2398.

          Our internal deployment of the UUID based counters is blocked on good read performance for compressed wide rows, so our next step will be to integrate this patch with CASSANDRA-674.

          Stu Hood added a comment - - edited

          Output from some stress.java runs is attached as version-*.txt:

          • version-f: trunk
          • version-g: promoted index without LZF, but with type specific compression (for offsets and value lengths)
          • version-g-lzf: promoted index with all the features of 2398

          A slightly larger scale wide row test is described in promotion.pdf

          • 90/10 writes/reads for 6 hours
            • appending to 30,000 keys
            • reading the tail of the row
          • sstables-per-read histogram for a wide-row workload
          • trunk (unpromoted) vs 2319 (promoted)
          • ~12GB of data, ~3GB of page cache

          Two takeaways:

          1. Promotion allowed one less sstable to be accessed on average
            • At the end of both runs, the largest sstable was 6GB, which was also the only file large enough for promotion to kick in (for 30,000 keys and column_index_size=64KB, it takes an sstable larger than 1.92GB for the nested index to start promoting column names)
          2. The promoted index was able to accomplish 25% more reads during the time period covered by the test, likely due to hitting one less file
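
          The 1.92GB figure in the first takeaway is just the quoted parameters multiplied out: a row only gets promoted column-name entries once it exceeds column_index_size, so with 30,000 keys the sstable as a whole must exceed 30,000 × 64KB. A sketch of that arithmetic (method name is illustrative):

```java
public class PromotionThreshold {
    // A row's column names are only promoted once the row exceeds
    // column_index_size, so an sstable must exceed keys * column_index_size
    // before the nested index starts promoting names anywhere.
    public static long thresholdKB(long keys, long columnIndexSizeKB) {
        return keys * columnIndexSizeKB;
    }

    public static void main(String[] args) {
        // 30,000 keys * 64KB = 1,920,000 KB, i.e. the ~1.92GB quoted above
        System.out.println(thresholdKB(30_000, 64));
    }
}
```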
          Stu Hood added a comment -

          The attached patch depends on 2336 and 2398, and implements a compressed, promoted index (called NestedIndex) containing enough information to eliminate wide rows without seeking into the data file. Only some of the features from the issue description are actually implemented: the primary goal of this first version was to begin eliminating sstables for wide-row (time series) usecases.

          The nested index contains nearly as much information as our current row header: the only thing missing at this point is the bloom filter (which I think needs more thought to get right, and wasn't critical for our slice usecase).

          Narrow rows (rows < column_index_size) are represented in the index by their key, offset and two bits to indicate that neither metadata nor column names have been stored in the index. Rows wider than column_index_size will have an entry containing their metadata and one or more column entries, with a bit per column entry to indicate their ownership.

          The key cache stores RowHeaders, which have a base implementation that is essentially just a boxed long offset (matching our existing impl in memory usage). Wide rows use the NestedRowHeader subclass, which additionally contains row metadata and the min and max columns in the row, which is enough to eliminate a row for slices. The intention is that as our file format migrates toward being block based, RowHeader will evolve into a BlockHeader describing a seekable point in the data file (and possibly cached in a sorted structure as described above).

          Data in the index mostly uses the design from [FileFormatDesignDoc], and is compressed using either type specific compression or LZF (via CASSANDRA-2398). The end result is that for a narrow-row usecase and simple keys, the index is half the size of the existing implementation: complex keys will likely see larger benefits.
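
          The type-specific compression of clustered offsets mentioned above can be sketched as plain delta encoding: since offsets within a row index are sorted, storing the first value and then successive gaps yields small numbers that a varint or LZF pass compresses well. The names here are illustrative, not the patch's actual encoder:

```java
import java.util.Arrays;

// Sketch of delta encoding for a sorted run of offsets.
public class OffsetDeltas {
    public static long[] encode(long[] offsets) {
        long[] deltas = offsets.clone();
        for (int i = deltas.length - 1; i > 0; i--)
            deltas[i] -= deltas[i - 1]; // each entry becomes a gap, not an absolute
        return deltas;
    }

    public static long[] decode(long[] deltas) {
        long[] offsets = deltas.clone();
        for (int i = 1; i < offsets.length; i++)
            offsets[i] += offsets[i - 1]; // running sum restores absolutes
        return offsets;
    }

    public static void main(String[] args) {
        long[] offsets = {65536, 131200, 196990};
        System.out.println(Arrays.toString(encode(offsets)));
    }
}
```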

          Stu Hood added a comment -

          Regarding the sorted cache structure: it looks like it could be accomplished by using a MapMaker LRU map, and an eviction listener to maintain a CSLM: fairly heavy, but...

          Stu Hood added a comment -

          A solution for the key cache is to allow for fuzzy cache entry matches via a sorted cache structure (if it existed, something like ConcurrentLinkedSkipListMap would be ideal).

          The key cache as it exists gives us exact matches for keys, but when the resolution of the cache increases to columns, the chance of hitting the same column twice (while reasonably high) is not high enough. Ideally we'd be able to fuzzily hit a nearby/next-highest cache entry that represents a range or block of columns.

          An example: the cache contains an entry for columns in the range:

          {("user1","entry0100"), ("user1","entry0200")}

          If a query comes in for a slice starting from ("user1", "entry0150"), we would perform a fuzzy/floor lookup in the cache and hit our entry. A lookup that doesn't fall into the range covered by a cache entry would be a miss, and would result in reading from the index, and the smallest range bounding the lookup being added to the cache.
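
          The floor-lookup behaviour described above can be sketched with the JDK's ConcurrentSkipListMap (the ConcurrentLinkedSkipListMap wished for earlier doesn't exist, but the skip-list map provides the needed floorEntry). All names and the Long "block position" value are placeholders, not the patch's actual cache entry type:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of a fuzzy/floor cache: maps the start column of a cached block
// to (end column, block position); a lookup hits only if the queried
// column lands inside a cached range.
public class FuzzyColumnCache {
    private final ConcurrentSkipListMap<String, Map.Entry<String, Long>> cache =
            new ConcurrentSkipListMap<>();

    public void put(String start, String end, long blockPosition) {
        cache.put(start, new SimpleEntry<>(end, blockPosition));
    }

    /** Returns the cached block position covering `column`, or null on a miss. */
    public Long lookup(String column) {
        Map.Entry<String, Map.Entry<String, Long>> e = cache.floorEntry(column);
        if (e == null)
            return null; // no cached range starts at or before this column
        // hit only if the column is <= the cached range's end column
        return column.compareTo(e.getValue().getKey()) <= 0 ? e.getValue().getValue() : null;
    }
}
```

          With the entry {("user1","entry0100"), ("user1","entry0200")} cached, a lookup for "entry0150" floors to "entry0100" and hits, while "entry0250" floors to the same entry but falls past its end and misses.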

          Sylvain Lebresne added a comment -

          Agreed... my point is simply that the number of columns-per-key and the number of keys are inversely proportional: if you have more columns-per-key, you have less keys, and vice-versa. The index will grow proportionally with the total number of columns, not with the number of keys.

          I do not share your confidence that this is axiomatic. It is certainly not axiomatic to the data model. Anyway, that was just a remark, not a criticism of the approach.

          Yea... the key cache as it exists does not necessarily need to change, but at some point we'll want to update it to include the improvements from this ticket.

          Maybe there is a misunderstanding here. I assumed that promoting the row index implied removing the row index (in favor of a richer sstable index). And even though a first iteration of this doesn't necessarily imply this removal, I'll still assume it, because I believe this would be weird to keep in the long run even if we keep it in the short run.

          So if you don't have a row index, caching the row key position as the actual key cache does will be counter-productive for any non-narrow row, since looking at the sstable index would get you closer to the column. So it would make the key cache as it exists only useful for narrow rows (which makes it less useful, though not useless).

          It depends on the number of unique queries to the row, but I'm willing to bet that the number of unique queries to a row is relatively low.

          Take time series (which I doubt can be called a niche use case). If the start of your slice query depends on the current time, almost all the queries will be unique. Or if you page on the time series and it has a reasonably high rate of inserts, then the pages will always be changing and thus so will your queries. Given how long it took me to come up with those two examples (which I did personally use, btw; it's not just my imagination running wild), I suspect there are a number of other similar cases.

          Will those be a minority of all the queries on wide rows? I don't know, probably for some people but maybe not for others. People come up with new ways to use the Cassandra data model all the time; let's not base our reasoning on unchecked assumptions about the kind of queries people do.

          Is that a big deal considering that in promoting the row index we will be at 2 seeks for those cases but we're already at 2 seeks on a key cache hit? Probably not (though for the pleasure of nitpicking I'll add that the 2 seeks in the current case of a key cache hit are closer on disk than the 2 seeks with the promoted index would be).

          Am I willing to not keep those cases in mind because Stu is willing to bet it doesn't matter? Certainly not (which doesn't mean I don't love you).

          Anyways, I'm in favor of trying this (honestly more because I believe this could make checksumming and compression easier to add than anything else).

          Stu Hood added a comment -

          > What is the definition of a narrow row?
          I expect that the definition will still be approximately "less than column_index_size_in_kb". For those users, the change would have no effect, since they didn't have a row index in the first place.

          Edward Capriolo added a comment -

          Is the plan to let users choose the type on a per-CF basis? If so, by version 0.8.0 will a user choose between standard | compressed | wide when creating a CF?

          I think the use case and description make sense. However:

          The index will grow proportionally with the total number of columns, not with the number of keys.

          This worries me. We tend to do getSlice for entire keys. So having these large indexes does not benefit this use case much.

          For narrow rows, this change would have no effect, as they will not reach the threshold for indexing anyway.

          What is the definition of a narrow row?

          Stu Hood added a comment - - edited

          > I'm not saying anything else. What I'm saying is that there is potentially
          > orders of magnitude more 'tuples' than there are keys.
          Agreed... my point is simply that the number of columns-per-key and the number of keys are inversely proportional: if you have more columns-per-key, you have less keys, and vice-versa. The index will grow proportionally with the total number of columns, not with the number of keys.

          > Also, as tjake remarked, it is unclear how to update the key cache with that proposal.
          Yea... the key cache as it exists does not necessarily need to change, but at some point we'll want to update it to include the improvements from this ticket.
          > You could cache column position ('tuple') directly, but that will be much less useful.
          I'm not sure that this would be less useful: promoting the index takes us from 3 to 2 seeks for wide rows: making this particular change to the key cache would take it from 2 to 1 seek for wide rows (get cached position, seek directly to column). It depends on the number of unique queries to the row, but I'm willing to bet that the number of unique queries to a row is relatively low.

          Sylvain Lebresne added a comment -

          The important thing to remember is that the distinction between columns and keys should be very fuzzy: columns are a suffix on keys, and treating them otherwise leads to complications. In this case, we shouldn't be holding every 128th "key" in memory, but instead every 128th-512th tuple: that way wide rows are handled naturally.

          I'm not saying anything else. What I'm saying is that there are potentially orders of magnitude more 'tuples' than there are keys. So far I doubt many people have ever changed the index_interval value. I suppose this will change if we do this (I mean, we have advertised that you can have 2 billion columns per row, after all), and we may even be willing to make index_interval configurable per-CF. In turn, this will be more things to consider for the user and a bigger chance for them to OOM if they are not careful. Probably nothing horrible, but let's just make sure we understand as many of the pros and cons as possible.

          Also, as tjake remarked, it is unclear how to update the key cache with that proposal. You could cache column position ('tuple') directly, but that will be much less useful. Maybe the key cache could be kept but limited to skinny rows. Something to consider anyway.

          T Jake Luciani added a comment -

          This would make the key cache much less effective, no? Also, the index file would now contain an entry for every row rather than every Nth.

          Stu Hood added a comment -

          > I think this would also mean we can remove the back-seek from sstable writing.
          > In which case I am a huge fan in principle.
          I suspect that this is more challenging: the back-seek also writes the row length. We'd need to move to a blocked design like the one on #674 to replace the row length with a block length in the data file.
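A minimal sketch of the blocked idea (hypothetical layout, not the #674 proposal itself): each block is buffered and written with its own length prefix, so the writer never needs the back-seek that filling in a total row length requires — the stream stays append-only.

```python
# Illustrative only: write columns as length-prefixed blocks. The block
# length is known before the block is written, so no seek-back is needed.
import io
import struct

def write_row_blocked(out, columns, block_size=3):
    """Write `columns` (a list of bytes values) as length-prefixed blocks."""
    for i in range(0, len(columns), block_size):
        block = b"".join(columns[i:i + block_size])
        out.write(struct.pack(">I", len(block)))  # 4-byte block length
        out.write(block)

buf = io.BytesIO()
write_row_blocked(buf, [b"a", b"bb", b"ccc", b"dddd"])
# The buffer now holds two blocks: a 6-byte block ("abbccc") and a
# 4-byte block ("dddd"), each preceded by its length.
```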

          > This will remove as a consequence the row bloom filter
          Not necessarily, but I think this ticket does highlight the fact that the column bloom filter is ill-positioned: it can prevent the 3rd seek, but only for names queries, which (I suspect) are less likely on wide rows. Nonetheless, I can imagine wanting to do a point query for a secondary index to determine whether a particular row matches the index, so we should probably consider promoting it as well.
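For context, a toy column-name bloom filter (illustrative Python, not Cassandra's implementation) showing how a promoted filter could veto the data-file seek for a names query — a definite "no" means the sstable can be skipped without touching the data file:

```python
# Toy bloom filter: a single integer bitmap plus k hash positions per name.
# False positives are possible; false negatives are not.
import hashlib

class TinyBloom:
    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes, self.bitmap = bits, hashes, 0

    def _positions(self, name: bytes):
        for i in range(self.hashes):
            h = hashlib.md5(bytes([i]) + name).digest()
            yield int.from_bytes(h[:8], "big") % self.bits

    def add(self, name: bytes):
        for p in self._positions(name):
            self.bitmap |= 1 << p

    def might_contain(self, name: bytes) -> bool:
        return all(self.bitmap >> p & 1 for p in self._positions(name))

bf = TinyBloom()
bf.add(b"col1")
# bf.might_contain(b"col1") is True; for names never added it is almost
# always False, letting a reader skip the seek into the data file.
```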

          > There will be new trade-offs: either you index 'often' or the index becomes potentially much bigger
          The important thing to remember is that the distinction between columns and keys should be very fuzzy: columns are a suffix on keys, and treating them otherwise leads to complications. In this case, we shouldn't be holding every 128th "key" in memory, but instead every 128th-512th tuple: that way wide rows are handled naturally.

          This also normalizes our indexing: the size of your index depends on the total number of columns, instead of on the width and number of your rows.
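A sketch of what sampling every Nth tuple (rather than every Nth key) could look like; the names and structure here are hypothetical, not Cassandra's index:

```python
# Illustrative only: sample every `interval`-th (row_key, column_name) tuple
# from a sorted sequence, then binary-search the samples to bound a scan.
# A single wide row contributes many samples, so it is handled the same way
# as many narrow rows.
import bisect

def build_tuple_index(tuples, interval):
    """tuples: sorted (row_key, column_name) pairs; returns (tuple, position)."""
    return [(t, i) for i, t in enumerate(tuples) if i % interval == 0]

def nearest_sample(samples, target):
    """Greatest sampled tuple <= target, or None if target precedes them all."""
    keys = [t for t, _ in samples]
    i = bisect.bisect_right(keys, target) - 1
    return samples[i] if i >= 0 else None

# One wide row "k" with 10 columns, sampled every 4th tuple.
samples = build_tuple_index([("k", "c%03d" % j) for j in range(10)], interval=4)
# nearest_sample(samples, ("k", "c005")) lands on the sample at position 4,
# bounding the forward scan to at most `interval` tuples.
```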

          Sylvain Lebresne added a comment -

          I'm reasonably in favor of this too, given that it would also give us a good place to add checksumming (which should be a reasonably short term priority imho) and compression.

          There are, however, a few things to consider:

          • As a consequence, this will remove the row bloom filter (maybe it makes sense to promote it too, in some form, to the sstable level, but I'm not sure). The row bloom filter is not the most useful trick we have around (only useful for names queries, for starters). Still something to keep in mind.
          • There will be new trade-offs: either you index 'often' or the index becomes potentially much bigger (in which case you push the problem to the sparse in-memory index). I suspect it will be harder to have a one-configuration-fits-all setting.
          Jonathan Ellis added a comment -

          I think this would also mean we can remove the back-seek from sstable writing. In which case I am a huge fan in principle.


            People

            • Assignee:
              Sylvain Lebresne
            • Reporter:
              Stu Hood
            • Reviewer:
              Stu Hood