[CASSANDRA-12940] Large compaction backlogs should slow down repairs - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Duplicate
Fix Version/s: None
Component/s: None
Labels: None

Description

Repairs cause a flood of small sstables. In some situations they arrive so fast that committing the compaction transaction takes longer than streaming in the tables. The sstables then pile up, and the pileup slows compaction down even further (see CASSANDRA-12764).

On one of my clusters this means nodes with a load average above 100 and tables with 10k sstables. After the repair finishes the nodes go back to normal, but that takes a while and hurts query latency a lot in the meantime.

The compaction paths could probably be faster, though I'm more interested in making repairs wait for compaction. When an L0 holds 10000+ tables, the repair path should probably wait a minute.
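The waiting behaviour proposed above could look roughly like the following sketch. This is not Cassandra code: `wait_for_l0_drain`, its parameters, and the injected `l0_count` probe are hypothetical names, and the defaults (10000 tables, a one-minute pause) are simply the figures mentioned in this description. The cap on the number of pauses is an added assumption so that a stalled compaction cannot block a repair indefinitely.

```python
import time
from typing import Callable


def wait_for_l0_drain(
    l0_count: Callable[[], int],
    threshold: int = 10_000,
    pause_seconds: float = 60.0,
    max_pauses: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pause while L0 holds at least `threshold` sstables.

    `l0_count` is a caller-supplied probe returning the current L0 size.
    Returns the number of pauses taken; gives up after `max_pauses`.
    """
    pauses = 0
    while pauses < max_pauses and l0_count() >= threshold:
        sleep(pause_seconds)
        pauses += 1
    return pauses
```

A caller would invoke this before accepting the next batch of streamed sstables, passing whatever probe it has for the current L0 size. Passing `sleep` as a parameter keeps the function testable without real delays.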

All I did was run 'nodetool repair':

                SSTable count: 11755
                SSTables in each level: [11709/4, 23/10, 50, 0, 0, 0, 0, 0, 0]
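For anyone who wants to watch for this condition from a script, here is a minimal sketch that parses the "SSTables in each level" line shown above. The function names are made up for illustration, and the sketch assumes that an entry written as `count/limit` marks a level holding more sstables than its target, while a bare number means the level is within its target.

```python
import re
from typing import List, Optional, Tuple

LEVELS_RE = re.compile(r"SSTables in each level:\s*\[(.*)\]")


def parse_sstable_levels(line: str) -> List[Tuple[int, Optional[int]]]:
    """Return one (count, limit) pair per level; limit is None when absent."""
    match = LEVELS_RE.search(line)
    if match is None:
        raise ValueError("not an 'SSTables in each level' line")
    levels = []
    for entry in match.group(1).split(","):
        # "11709/4" -> ("11709", "/", "4"); "50" -> ("50", "", "")
        count, _, limit = entry.strip().partition("/")
        levels.append((int(count), int(limit) if limit else None))
    return levels


def overfull_levels(levels: List[Tuple[int, Optional[int]]]) -> List[int]:
    """Indexes of levels whose count exceeds the printed limit."""
    return [
        index
        for index, (count, limit) in enumerate(levels)
        if limit is not None and count > limit
    ]
```

Applied to the line above, the first element of the parsed list is the L0 entry, which is the number a repair throttle would care about.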

'nodetool compactionstats' shows 17 pending tasks (which seems a bit low) and 'nodetool netstats' shows 1861 lines of text over 138 stream sessions.

Issue Links

duplicates

CASSANDRA-10862 LCS repair: compact tables before making available in L0

Open

People

Assignee: Unassigned

Reporter: Tom van der Woerdt

Votes: 0

Watchers: 4

Dates

Created: 21/Nov/16 21:28

Updated: 16/Apr/19 09:30

Resolved: 03/Jan/17 13:18