[CASSANDRA-4316] Compaction Throttle too bursty with large rows - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 1.2.5
Component/s: None
Labels:
- qa-resolved

Description

In org.apache.cassandra.db.compaction.CompactionIterable the check for compaction throttling occurs once every 1000 rows. In our workload this is much too large as we have many large rows (16 - 100 MB).

With 100 MB rows, about 100 GB (1000 rows x 100 MB) is read (and possibly written) before the compaction throttle sleeps. This causes bursts of essentially unthrottled compaction IO followed by a long sleep, which yields inconsistent performance and high error rates during the bursts.
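To make the size of the burst concrete, here is a rough back-of-the-envelope sketch. The class and method names are hypothetical, the 16 MB/sec limit is only the example figure used later in this description, and the sleep estimate assumes the burst itself takes negligible time, so it is an upper bound rather than a measurement.

```java
public final class BurstEstimate {
    /** Bytes read between two throttle checks when the check runs once every rowsPerCheck rows. */
    public static long bytesBetweenChecks(long rowsPerCheck, long rowSizeBytes) {
        return rowsPerCheck * rowSizeBytes;
    }

    /**
     * Seconds of sleep needed to bring the average rate back down to the limit,
     * assuming (pessimistically) that the burst itself took no time.
     */
    public static double catchUpSeconds(long burstBytes, long limitBytesPerSec) {
        return (double) burstBytes / limitBytesPerSec;
    }

    public static void main(String[] args) {
        long mb = 1_000_000L;                              // decimal megabyte, for round numbers
        long burst = bytesBetweenChecks(1000, 100 * mb);   // 1000 rows of 100 MB each
        System.out.println("bytes between checks: " + burst);
        System.out.println("catch-up sleep at 16 MB/sec (s): " + catchUpSeconds(burst, 16 * mb));
    }
}
```

Under these assumptions a single check interval covers 100 GB, and the sleep that follows would be on the order of 6250 seconds, which matches the burst-then-stall pattern described above.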

We applied a workaround that checks the throttle on every row, which solved our performance and error issues:

line 116 in org.apache.cassandra.db.compaction.CompactionIterable:
if ((row++ % 1000) == 0)
replaced with
if ((row++ % 1) == 0)

I think a better solution is to derive how often the throttle is checked from the throttle rate, so that sleeps are applied more consistently. E.g. if the limit is 16 MB/sec, then check for sleep after every 16 MB read, so sleeps are spaced about one second apart.
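A minimal sketch of that byte-based idea follows. It is not the patch attached to this ticket; `ByteIntervalThrottle` and `onBytesRead` are hypothetical names, and the caller is assumed to pass in the current time (e.g. from `System.nanoTime()`) and to sleep for the returned number of milliseconds.

```java
public final class ByteIntervalThrottle {
    private final long limitBytesPerSec;
    private long bytesSinceCheck = 0;
    private long lastCheckNanos;

    public ByteIntervalThrottle(long limitBytesPerSec, long startNanos) {
        this.limitBytesPerSec = limitBytesPerSec;
        this.lastCheckNanos = startNanos;
    }

    /**
     * Records bytes just read and returns how many milliseconds the caller
     * should sleep. The throttle is only evaluated once roughly one second's
     * worth of bytes (at the configured limit) has accumulated, so the check
     * interval follows the data volume rather than the row count.
     */
    public long onBytesRead(long bytes, long nowNanos) {
        bytesSinceCheck += bytes;
        if (bytesSinceCheck < limitBytesPerSec) {
            return 0; // not enough data yet to be worth checking
        }
        // Time these bytes should have taken at the configured limit.
        long expectedNanos = (long) ((double) bytesSinceCheck / limitBytesPerSec * 1e9);
        long elapsedNanos = nowNanos - lastCheckNanos;
        long sleepNanos = Math.max(0L, expectedNanos - elapsedNanos);
        // Start the next interval from the moment the sleep would end.
        bytesSinceCheck = 0;
        lastCheckNanos = nowNanos + sleepNanos;
        return sleepNanos / 1_000_000L;
    }

    public static void main(String[] args) {
        ByteIntervalThrottle throttle = new ByteIntervalThrottle(16_000_000L, 0L);
        System.out.println(throttle.onBytesRead(8_000_000L, 100_000_000L));
        System.out.println(throttle.onBytesRead(8_000_000L, 200_000_000L));
    }
}
```

With this shape, small rows trigger a check about once per second of allowed throughput, while a single row larger than the limit triggers a check immediately after it is read, so no burst can span more than one oversized row.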

Attachments

- 4316-1.2.txt (30/Apr/13 16:29, 24 kB, Jonathan Ellis)
- 4316-1.2.txt (03/Jan/13 19:46, 16 kB, Yuki Morishita)
- 4316-1.2-v2.txt (14/Jan/13 19:38, 17 kB, Yuki Morishita)
- 4316-v3.txt (27/Apr/13 06:20, 22 kB, Jonathan Ellis)

People

Assignee: Jonathan Ellis

Reporter: Wayne Lewis

Authors: Jonathan Ellis

Reviewers: Yuki Morishita

Tester: Ryan McGuire

Votes: 1

Watchers: 6

Dates

Created: 07/Jun/12 10:57

Updated: 16/Apr/19 09:32

Resolved: 30/Apr/13 21:01