[CASSANDRA-12730] Thousands of empty SSTables created during repair - TMOF death - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Urgent
Resolution: Unresolved
Fix Version/s: None
Component/s: Feature/Materialized Views, Legacy/Local Write-Read Paths
Labels: None
Severity: Critical

Description

Last night I ran a repair on a keyspace with 7 tables and 4 MVs, each containing a few hundred million records. After a few hours a node died because of "too many open files" (TMOF).
Normally one would just raise the limit, but we had already set it to 100k. The problem was that the repair created more than 100k SSTables for one particular MV. The strange thing is that these SSTables contained almost no data (e.g. 53 bytes, 90 bytes, ...). Some of them (<5%) had a few hundred KB, and very few (<1%) had normal sizes of a few MB or more. I could understand SSTables queueing up when they are flushed and not compacted in time, but then they should each be at least a few MB (depending on config and available memory), right?
Of course the node then runs out of FDs, and I guess it is not a good idea to raise the limit even higher, as I expect that would just create even more empty SSTables before the node finally dies.

Only 1 CF (an MV) was affected. All other CFs (including the other MVs) behaved sanely. The empty SSTables were created at a steady rate over time, 100-150 every minute. Among the empty SSTables there were also some that looked normal, with sizes of a few MB.
I didn't see any errors or exceptions in the logs until the TMOF occurred, just tons of streams due to the repair (which I actually ran through cs-reaper as full subrange repairs).
After I restarted that node (with no repair running anymore), the number of SSTables went down again as they were slowly compacted away.
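
For anyone trying to confirm the same pattern, the size distribution described above can be checked with a small script. This is only a rough sketch: it assumes the SSTable data components in a table directory are the files ending in `-Data.db`, the function names and the bucket thresholds (1 KiB and 1 MiB) are chosen purely for illustration, and it deliberately does not descend into subdirectories such as snapshots.

```python
import os
from collections import Counter

KIB = 1024
MIB = 1024 * 1024

def bucket_for(size_bytes):
    """Map a file size to a coarse bucket (thresholds are illustrative)."""
    if size_bytes < KIB:
        return "tiny (<1 KiB)"
    if size_bytes < MIB:
        return "small (<1 MiB)"
    return "normal (>=1 MiB)"

def sstable_size_histogram(table_dir):
    """Count SSTable data files in table_dir per size bucket.

    Only the top level of table_dir is scanned, so snapshot and
    backup subdirectories are not counted.
    """
    counts = Counter()
    with os.scandir(table_dir) as entries:
        for entry in entries:
            # Assumption: one "-Data.db" file per SSTable.
            if entry.is_file() and entry.name.endswith("-Data.db"):
                counts[bucket_for(entry.stat().st_size)] += 1
    return counts
```

Calling `sstable_size_histogram()` with the data directory of the affected MV (the path depends on the local installation) returns a `Counter` keyed by bucket name. A count dominated by the "tiny" bucket would match the behaviour reported here.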

According to zznate this issue may relate to ~~CASSANDRA-10342~~ + ~~CASSANDRA-8641~~

Issue Links

is related to

CASSANDRA-12489 consecutive repairs of same range always finds 'out of sync' in sane cluster

Open

relates to

CASSANDRA-12888 Incremental repairs broken for MVs and CDC

Open

People

Assignee: Benjamin Roth

Reporter: Benjamin Roth

Authors: Benjamin Roth

Votes: 0

Watchers: 26

Dates

Created: 29/Sep/16 06:00

Updated: 16/Apr/19 09:30