CASSANDRA-10538

Assertion failed in LogFile when disk is full

Details

    • Normal

    Description

      carlyeks was running a stress job which filled up the disk. At the end of the system logs there are several assertion errors:

      ERROR [CompactionExecutor:1] 2015-10-14 20:46:55,467 CassandraDaemon.java:195 - Exception in thread Thread[CompactionExecutor:1,1,main]
      java.lang.RuntimeException: Insufficient disk space to write 2097152 bytes
              at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.getWriteDirectory(CompactionAwareWriter.java:156) ~[main/:na]
              at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:77) ~[main/:na]
              at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:110) ~[main/:na]
              at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182) ~[main/:na]
              at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
              at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:78) ~[main/:na]
              at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) ~[main/:na]
              at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:220) ~[main/:na]
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_40]
              at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_40]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
              at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
      INFO  [IndexSummaryManager:1] 2015-10-14 21:10:40,099 IndexSummaryManager.java:257 - Redistributing index summaries
      ERROR [IndexSummaryManager:1] 2015-10-14 21:10:42,275 CassandraDaemon.java:195 - Exception in thread Thread[IndexSummaryManager:1,1,main]
      java.lang.AssertionError: Already completed!
              at org.apache.cassandra.db.lifecycle.LogFile.abort(LogFile.java:221) ~[main/:na]
              at org.apache.cassandra.db.lifecycle.LogTransaction.doAbort(LogTransaction.java:376) ~[main/:na]
              at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) ~[main/:na]
              at org.apache.cassandra.db.lifecycle.LifecycleTransaction.doAbort(LifecycleTransaction.java:259) ~[main/:na]
              at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:144) ~[main/:na]
              at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.abort(Transactional.java:193) ~[main/:na]
              at org.apache.cassandra.utils.concurrent.Transactional$AbstractTransactional.close(Transactional.java:158) ~[main/:na]
              at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:242) ~[main/:na]
              at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:134) ~[main/:na]
              at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
              at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolE
      

      An assertion should not be reachable merely because the disk is full; that condition should raise a runtime exception instead.
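To illustrate the distinction, here is a minimal sketch of a completion guard that throws a runtime exception rather than relying on `assert`. The class `TxnLog` is a hypothetical stand-in written for this example; it is not Cassandra's `LogFile`, and the real fix may look different.

```java
public class CompletionGuardSketch {

    /** Hypothetical stand-in for a transaction log; not Cassandra's LogFile. */
    static final class TxnLog {
        private boolean completed = false;

        /** Marks the log as completed, rejecting a second completion attempt. */
        void abort() {
            // An `assert !completed` is skipped unless the JVM runs with -ea,
            // and raises an Error when it does fire. An explicit check always
            // runs and raises an ordinary runtime exception callers can handle.
            if (completed) {
                throw new IllegalStateException("Already completed!");
            }
            completed = true;
        }

        boolean isCompleted() {
            return completed;
        }
    }

    public static void main(String[] args) {
        TxnLog log = new TxnLog();
        log.abort();
        try {
            log.abort(); // second call is rejected with a runtime exception
        } catch (IllegalStateException e) {
            System.out.println("second abort rejected: " + e.getMessage());
        }
    }
}
```

The explicit check costs one branch and behaves the same whether or not assertions are enabled.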

      I would also like to understand exactly what triggered the assertion. LifecycleTransaction can throw at the beginning of the commit method if it cannot write the record to disk; in that case, all we have to do is ensure that we update the in-memory records after writing to disk (currently we update them before). However, I am not sure that this is what happened here: it looks more like abort was called twice, which should never happen.
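The ordering described above (write to disk first, update memory only on success) could be sketched as follows. `RecordLog` and its methods are hypothetical names chosen for this example and do not reflect Cassandra's actual classes; the failure is simulated by writing into a directory that does not exist rather than by filling a disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DiskFirstLogSketch {

    /** Hypothetical record log that keeps an in-memory copy of what is on disk. */
    static final class RecordLog {
        private final Path file;
        private final List<String> records = new ArrayList<>();

        RecordLog(Path file) {
            this.file = file;
        }

        /** Appends a record to disk, then mirrors it in memory only on success. */
        void append(String record) {
            byte[] line = (record + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
            try {
                Files.write(file, line, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                // The in-memory state has not been touched yet, so a failed
                // write leaves the log looking exactly as it did before.
                throw new UncheckedIOException("Failed to write record: " + record, e);
            }
            records.add(record);
        }

        List<String> records() {
            return Collections.unmodifiableList(records);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("txn-log-sketch");

        RecordLog ok = new RecordLog(dir.resolve("txn.log"));
        ok.append("COMMIT");
        System.out.println("records after successful write: " + ok.records());

        // The parent directory does not exist, so the write fails.
        RecordLog failing = new RecordLog(dir.resolve("missing").resolve("txn.log"));
        try {
            failing.append("COMMIT");
        } catch (UncheckedIOException e) {
            System.out.println("records after failed write: " + failing.records());
        }
    }
}
```

With this ordering, a later abort sees no completion record in memory after a failed commit write, so a completion check would not trip for that reason alone.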

            People

              stefania Stefania Alborghetti
              Ariel Weisberg