[CASSANDRA-5569] Every stream operation requires checking indexes in every SSTable - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Low
Resolution: Fixed
Fix Version/s: 1.2.6
Component/s: None
Labels: streaming
Severity: Low

Description

There appears to be a streaming performance problem when leveled compaction (LCS) is combined with vnodes. To build the candidate set of chunks to stream, the streaming system acquires references to every SSTable for a CF. That is probably reasonable without vnodes, because the data being streamed is likely distributed across the full SSTable set, and it is probably reasonable for size-tiered compaction for the same reason. However, for each vnode repair performed on an LCS CF, this scan across potentially tens of thousands of SSTables is wasteful, since only a small percentage of them will actually hold data for a given range.

This manifested itself as "hanging" repair operations with tasks backing up on the MiscStage thread pool.

The attached patch changes the streaming code so that, for a given range, only the SSTables covering the requested range are considered for inclusion in streaming.
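
The idea of restricting the candidate set by range can be illustrated with a minimal sketch. This is not the code from the attached patch and not Cassandra's API: the `SSTable` and `TokenRange` classes, the integer tokens, and the function names are all hypothetical, and the sketch assumes a non-wrapping range that is exclusive on the left and inclusive on the right. It uses a simple linear filter on each SSTable's first and last token; how the patch itself locates the matching SSTables is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an SSTable: only its token bounds matter here."""
    name: str
    first_token: int  # smallest token stored in the SSTable
    last_token: int   # largest token stored in the SSTable


@dataclass(frozen=True)
class TokenRange:
    """Hypothetical non-wrapping range (left, right]."""
    left: int   # exclusive
    right: int  # inclusive


def overlaps(sstable: SSTable, rng: TokenRange) -> bool:
    # [first_token, last_token] intersects (left, right] exactly when the
    # SSTable starts at or before the range end and ends after the range start.
    return sstable.first_token <= rng.right and sstable.last_token > rng.left


def sstables_for_range(sstables: List[SSTable], rng: TokenRange) -> List[SSTable]:
    # Keep only SSTables that can hold data for the requested range,
    # instead of treating every SSTable of the CF as a candidate.
    return [s for s in sstables if overlaps(s, rng)]


sstables = [
    SSTable("a", 0, 99),
    SSTable("b", 100, 199),
    SSTable("c", 200, 299),
]
candidates = sstables_for_range(sstables, TokenRange(120, 150))
print([s.name for s in candidates])  # prints ['b']
```

With leveled compaction the SSTables within a level cover disjoint token spans, so a narrow vnode range typically matches only a handful of them, which is why skipping the rest avoids most of the index checks.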

Attachments

- 5569.txt (14/May/13 23:46, 2 kB, Rick Branson)
- 5569-v2.txt (15/May/13 04:57, 8 kB, Rick Branson)
- 5569-v3.txt (15/May/13 16:05, 10 kB, Rick Branson)

People

Assignee: Rick Branson

Reporter: Rick Branson

Authors: Rick Branson

Reviewers: Yuki Morishita

Votes: 0

Watchers: 5

Dates

Created: 14/May/13 23:46

Updated: 16/Apr/19 09:32

Resolved: 16/May/13 16:46