[CASSANDRA-10538] Assertion failed in LogFile when disk is full - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 3.0.1, 3.1
Component/s: Legacy/Local Write-Read Paths
Labels: None
Severity: Normal

Description

carlyeks was running a stress job which filled up the disk. At the end of the system logs there are several assertion errors:

ERROR [CompactionExecutor:1] 2015-10-14 20:46:55,467 CassandraDaemon.java:195 - Exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.RuntimeException: Insufficient disk space to write 2097152 bytes
        at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.getWriteDirectory(CompactionAwareWriter.java:156) ~[main/:na]
        at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:77) ~[main/:na]
        at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:110) ~[main/:na]
        at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182) ~[main/:na]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) ~[main/:na]
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[main/:na]
        at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:220) ~[main/:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_40]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_40]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
INFO  [IndexSummaryManager:1] 2015-10-14 21:10:40,099 IndexSummaryManager.java:257 - Redistributing index summaries
ERROR [IndexSummaryManager:1] 2015-10-14 21:10:42,275 CassandraDaemon.java:195 - Exception in thread Thread[IndexSummaryManager:1,1,main]
java.lang.AssertionError: Already completed!
        at org.apache.cassandra.db.lifecycle.LogFile.abort(LogFile.java:221) ~[main/:na]
        at org.apache.cassandra.db.lifecycle.LogTransaction.doAbort(LogTransaction.java:376) ~[main/:na]
        at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) ~[main/:na]
        at org.apache.cassandra.db.lifecycle.LifecycleTransaction.doAbort(LifecycleTransaction.java:259) ~[main/:na]
        at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) ~[main/:na]
        at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:193) ~[main/:na]
        at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.close(Transactional.java:158) ~[main/:na]
        at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:242) ~[main/:na]
        at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:134) ~[main/:na]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
        at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolE

An assertion should not fire for a condition that can legitimately occur when the disk is full; a runtime exception should be thrown instead.

I would also like to understand exactly what triggered the assertion. LifecycleTransaction can throw at the beginning of the commit method if it cannot write the record to disk. In that case, all we have to do is ensure that we update the in-memory records after writing to disk (currently we update them before). However, I am not sure this is what happened here; it looks more like abort was called twice, which should never happen.
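To make the ordering concern concrete, here is a minimal sketch of the "write to disk first, then update memory" idea, with the completion check expressed as a runtime exception rather than an assertion. It is not Cassandra's LogFile or the committed patch: SketchLog, its Disk interface, and the record strings are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a transaction log; NOT Cassandra's LogFile. */
class SketchLog {
    /** Hypothetical disk abstraction so that a full disk can be simulated. */
    interface Disk {
        void append(String record); // may throw RuntimeException, e.g. when the disk is full
    }

    private final Disk disk;
    private final List<String> records = new ArrayList<>(); // in-memory view of the log
    private boolean completed = false;

    SketchLog(Disk disk) {
        this.disk = disk;
    }

    void commit() {
        complete("COMMIT");
    }

    void abort() {
        complete("ABORT");
    }

    private void complete(String record) {
        // A real exception instead of `assert`: the check also runs without -ea
        // and surfaces as an ordinary failure to the caller.
        if (completed)
            throw new IllegalStateException("Already completed!");

        // Write to disk first. If this throws, the in-memory state below is
        // untouched, so a later abort() is still permitted.
        disk.append(record);

        // Only after the write succeeded do we update the in-memory state.
        records.add(record);
        completed = true;
    }

    boolean isCompleted() {
        return completed;
    }

    List<String> records() {
        return records;
    }
}
```

With this ordering, a commit() that fails because the disk write throws leaves the log uncompleted, so the subsequent abort() triggered by closing the transaction does not trip the "Already completed!" check. A genuine second abort() after a successful completion still fails, but with an exception rather than an AssertionError.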

Attachments

- ma_txn_compaction_67311da0-72b4-11e5-9eb9-b14fa4bbe709.log (2 kB, 16/Oct/15 03:10, Stefania Alborghetti)
- ma_txn_compaction_696059b0-72b4-11e5-9eb9-b14fa4bbe709.log (3 kB, 16/Oct/15 03:10, Stefania Alborghetti)
- ma_txn_compaction_8ac58b70-72b4-11e5-9eb9-b14fa4bbe709.log (2 kB, 16/Oct/15 03:10, Stefania Alborghetti)
- ma_txn_compaction_8be24610-72b4-11e5-9eb9-b14fa4bbe709.log (1.0 kB, 16/Oct/15 03:10, Stefania Alborghetti)
- ma_txn_compaction_95500fc0-72b4-11e5-9eb9-b14fa4bbe709.log (0.6 kB, 16/Oct/15 03:10, Stefania Alborghetti)
- ma_txn_compaction_a41caa90-72b4-11e5-9eb9-b14fa4bbe709.log (0.3 kB, 16/Oct/15 03:10, Stefania Alborghetti)

Issue Links

links to: 3.0 patch

People

Assignee: Stefania Alborghetti
Reporter: Stefania Alborghetti
Authors: Stefania Alborghetti
Reviewers: Ariel Weisberg
Votes: 0
Watchers: 4

Dates

Created: 16/Oct/15 03:10
Updated: 16/Apr/19 09:30
Resolved: 13/Nov/15 15:04