Cassandra / CASSANDRA-12615

Improve LCS compaction concurrency during L0->L1 compaction


Details

    • Type: Improvement
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: Local/Compaction

    Description

      I've run multiple experiments with compaction-stress at the 100GB, 200GB, 400GB and 600GB data sizes. These scenarios share a common pattern: at the beginning of compaction they all overwhelm L0 with a large number (hundreds to thousands) of 128MB SSTables. One common observation from visualizing the compaction.log files of these tests is that, after the initial massive STCS-in-L0 activity (which can take up to 40% of the total compaction time), the first L0->L1 compaction always takes a very long time. It frequently involves all of the bigger L0 SSTables (the results of the many earlier STCS compactions) plus all 10 L1 SSTables, and its output covers almost the full data set. Since L0->L1 can only run single-threaded, we often spend close to 40% of the total compaction time in this L0->L1 stage, and only after this first, very long L0->L1 finishes and hundreds or thousands of SSTables land on L1 can concurrent compactions at higher levels resume (to move those L1 SSTables to higher levels). The attached snapshot demonstrates this observation.

      The question is: if this L0->L1 compaction is so big, can only run single-threaded, and ends up generating thousands of L1 SSTables, most of which will have to be up-leveled later anyway (since L1 can accommodate at most 10 SSTables), why not start that L1+ up-leveling earlier, i.e. before this L0->L1 compaction finishes?

      I can think of a few approaches:

      1) Break L0->L1 into smaller chunks whenever it becomes clear that the output of the L0->L1 compaction will far exceed the capacity of L1. This lets each L0->L1 chunk finish sooner, and the resulting L1 SSTables can then participate in higher up-level compactions (see the sketch after this list).
      2) Still treat the full L0->L1 as one big compaction session, but make the intermediate results available for higher up-level compactions once the number of L1 output SSTables exceeds the L1 capacity.
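      To make approach 1 concrete, here is a minimal, purely illustrative sketch of the size-based splitting. None of the types, names or constants below come from the Cassandra codebase; a real implementation would also have to pull the overlapping L1 SSTables into each chunk and keep the chunk outputs non-overlapping in token range.

      {code:java}
      // Hypothetical sketch of approach 1: split the L0 candidates into chunks whose
      // estimated L1 output stays close to the L1 capacity, so each chunk finishes
      // quickly and its output can start up-leveling while later chunks still run.
      // All types and constants here are illustrative stand-ins, not Cassandra's.
      import java.util.ArrayList;
      import java.util.List;

      final class L0ChunkingSketch
      {
          static final long SSTABLE_SIZE_BYTES = 128L * 1024 * 1024; // 128MB, as in the experiments
          static final int L1_TARGET_SSTABLES = 10;                  // L1 capacity noted in the description

          /** Minimal stand-in for an SSTable; only its on-disk size matters here. */
          static final class SSTable
          {
              final long sizeBytes;
              SSTable(long sizeBytes) { this.sizeBytes = sizeBytes; }
          }

          static List<List<SSTable>> splitIntoChunks(List<SSTable> l0Candidates)
          {
              long chunkBudget = L1_TARGET_SSTABLES * SSTABLE_SIZE_BYTES; // roughly one "full L1" per chunk
              List<List<SSTable>> chunks = new ArrayList<>();
              List<SSTable> current = new ArrayList<>();
              long currentBytes = 0;

              for (SSTable sstable : l0Candidates)
              {
                  if (!current.isEmpty() && currentBytes + sstable.sizeBytes > chunkBudget)
                  {
                      chunks.add(current);          // this chunk compacts on its own and finishes sooner
                      current = new ArrayList<>();
                      currentBytes = 0;
                  }
                  current.add(sstable);
                  currentBytes += sstable.sizeBytes;
              }
              if (!current.isEmpty())
                  chunks.add(current);
              return chunks;
          }
      }
      {code}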

      If we can somehow leverage more threads during this massive L0->L1 phase, we can save close to 40% of the total compaction time whenever L0 is initially backlogged, which would be a great improvement to our LCS compaction throughput.
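      For approach 2, a similarly hypothetical sketch (again, none of these types are Cassandra's): the long-running L0->L1 session would publish each finished L1 output SSTable as soon as it is sealed, once the session has already produced more output than L1 can hold, so that L1->L2 compactions can start consuming those SSTables while the tail of the L0->L1 session is still running.

      {code:java}
      // Hypothetical sketch of approach 2; all types here are illustrative stand-ins.
      import java.util.ArrayList;
      import java.util.List;

      final class IncrementalL1PublishSketch
      {
          static final int L1_TARGET_SSTABLES = 10; // L1 capacity noted in the description

          /** Minimal stand-in for the part of the leveled manifest this idea needs. */
          interface Manifest
          {
              void addToL1(String sealedSSTable); // makes the SSTable visible to L1->L2 candidate selection
          }

          private final Manifest manifest;
          private final List<String> heldBack = new ArrayList<>();
          private int sealedOutputs = 0;

          IncrementalL1PublishSketch(Manifest manifest)
          {
              this.manifest = manifest;
          }

          /** Called by the L0->L1 session each time it seals one output SSTable. */
          void onOutputSSTableSealed(String sealedSSTable)
          {
              sealedOutputs++;
              heldBack.add(sealedSSTable);
              // Once the session has clearly produced more than L1 can hold, stop
              // holding outputs back: publish everything sealed so far so that L1->L2
              // compactions can run concurrently with the rest of this session.
              if (sealedOutputs > L1_TARGET_SSTABLES)
              {
                  for (String s : heldBack)
                      manifest.addToL1(s);
                  heldBack.clear();
              }
          }
      }
      {code}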

      Attachments

        1. L0_L1_inefficiency.jpg (226 kB, Wei Deng)


      People

        Assignee: Unassigned
        Reporter: Wei Deng (weideng)
        Votes: 0
        Watchers: 4
