Description
I'm repurposing this issue to add a heuristic for deciding when to SEEK and when to SKIP Cells. This has come up in several issues, and I think I finally have a way to fix it. HBASE-9778, HBASE-12311, and friends are related.
— Old description —
This is a continuation of HBASE-9778.
We've seen a very slow scan over a region when the scan's timerange falls after the timestamp of every Cell in the region.
It turns out we spend a lot of that time seeking.
Tested with a 5-column table: the scan is 5x faster when the timerange instead falls before all Cells' timestamps.
We can use the lookahead hint introduced in HBASE-9778 to SKIP opportunistically before we actually SEEK.
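To illustrate the idea of opportunistic SKIPping, here is a minimal toy sketch. It is not HBase code: a sorted array of `long` keys stands in for the Cells of a store file, a binary search stands in for a SEEK, and the names `SkipOrSeekSketch`, `advance`, `lookahead`, and `Stats` are all hypothetical. It only shows the shape of the heuristic: try a bounded number of cheap SKIPs first, and pay for a SEEK only if the target has not been reached.

```java
import java.util.Arrays;

/** Toy model only; none of these names are HBase APIs. */
final class SkipOrSeekSketch {

    /** Counts how often each strategy was used, so the cost can be compared. */
    static final class Stats {
        int skips;
        int seeks;
    }

    /**
     * Advances from index pos to the first index whose key is >= target.
     * Tries up to `lookahead` SKIPs (single steps) first; if the target is
     * still ahead after that, falls back to one SEEK (binary search).
     * Assumes sortedKeys is ascending with no duplicates.
     * Returns sortedKeys.length if no such key exists.
     */
    static int advance(long[] sortedKeys, int pos, long target, int lookahead, Stats stats) {
        int i = pos;
        // Opportunistic phase: step forward cell by cell, at most `lookahead` times.
        for (int n = 0; n < lookahead && i < sortedKeys.length && sortedKeys[i] < target; n++) {
            i++;
            stats.skips++;
        }
        // Target reached (or input exhausted) by skipping alone: no seek needed.
        if (i >= sortedKeys.length || sortedKeys[i] >= target) {
            return i;
        }
        // Target is still ahead: do the expensive seek over the remaining range.
        stats.seeks++;
        int idx = Arrays.binarySearch(sortedKeys, i, sortedKeys.length, target);
        return idx >= 0 ? idx : -idx - 1; // negative result encodes the insertion point
    }
}
```

In this model a target that is only a few cells ahead is reached without any seek, while a distant target costs `lookahead` wasted SKIPs plus one SEEK, which is the trade-off a lookahead value has to balance.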
Attachments
Issue Links
- is related to
  - HBASE-9778 Add hint to ExplicitColumnTracker to avoid seeking (Closed)
  - PHOENIX-1731 Add getNextIndexedKey() to IndexHalfStoreFileReader and FilteredKeyValueScanner (Closed)
  - HBASE-9769 Improve performance of a Scanner with explicit column list when rows are small/medium size (Closed)
- relates to
  - HBASE-19034 Implement "optimize SEEK to SKIP" in storefile scanner (Closed)
  - HBASE-17958 Avoid passing unexpected cell to ScanQueryMatcher when optimize SEEK to SKIP (Closed)
  - PHOENIX-1731 Add getNextIndexedKey() to IndexHalfStoreFileReader and FilteredKeyValueScanner (Closed)