CASSANDRA-11750

Offline scrub should not abort when it hits corruption

Details

    • Low

    Description

      Hit a failure on startup due to corruption of some sstables in the system keyspace. Deleted the listed file and restarted; the node came down again, this time on a different file.

      Figured that I may as well run scrub to clean up all the files. Got the following error:

      sstablescrub system compaction_history 
      ERROR 17:21:34 Exiting forcefully due to file system exception on startup, disk failure policy "stop" 
      org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-1936-CompressionInfo.db 
      at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:169) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:741) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:692) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:480) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
      at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:376) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
      at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:523) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_79] 
      at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_79] 
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79] 
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79] 
      at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] 
      Caused by: java.io.EOFException: null 
      at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_79] 
      at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_79] 
      at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_79] 
      at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      ... 14 common frames omitted 
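
The "Caused by" section of the trace shows `DataInputStream.readUTF` failing inside `readUnsignedShort`, which means the stream ended before the two-byte length prefix of a string could be read. That is consistent with an empty or truncated CompressionInfo.db file. The snippet below is a minimal, standalone sketch of that failure mode using only the JDK; the class and helper names (`TruncatedHeaderDemo`, `readLeadingName`, `encode`) are hypothetical and are not Cassandra code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class TruncatedHeaderDemo {
    // Hypothetical helper: reads a leading length-prefixed string the way
    // DataInputStream.readUTF does, returning null if the data ends early.
    static String readLeadingName(byte[] fileBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileBytes))) {
            return in.readUTF();
        } catch (EOFException e) {
            // Same exception type as the "Caused by" entry in the trace.
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: produces the bytes writeUTF would store for a name.
    static byte[] encode(String name) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] intact = encode("ExampleCompressor");
        System.out.println(readLeadingName(intact));                   // intact data
        System.out.println(readLeadingName(new byte[0]));              // empty file
        System.out.println(readLeadingName(Arrays.copyOf(intact, 1))); // cut inside the length prefix
    }
}
```

Both the empty input and the one-byte input fail while reading the length prefix, matching the `readUnsignedShort` frame in the trace above.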
      

      I guess it might be by design - but I'd argue that I should at least have the option to continue and let it do its thing. I'd prefer that sstablescrub ignored the disk failure policy.
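
To make the request concrete, here is a rough sketch of the behaviour being asked for: an offline tool that records a corrupt file and moves on to the next one instead of exiting. It is an illustration only and does not reflect how sstablescrub is implemented or how this ticket was resolved. `ScrubLoopSketch`, `CorruptFileException`, and `Scrubber` are hypothetical stand-ins, not Cassandra classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScrubLoopSketch {
    // Hypothetical stand-in for a corruption error raised while opening a file.
    static class CorruptFileException extends RuntimeException {
        CorruptFileException(String file) {
            super("Corrupted: " + file);
        }
    }

    // Hypothetical stand-in for "open and scrub one sstable".
    interface Scrubber {
        void scrub(String file);
    }

    // Scrubs every file, skipping the ones that turn out to be corrupt.
    // Returns the skipped files so the operator can deal with them afterwards.
    static List<String> scrubAll(List<String> files, Scrubber scrubber) {
        List<String> skipped = new ArrayList<>();
        for (String file : files) {
            try {
                scrubber.scrub(file);
            } catch (CorruptFileException e) {
                // Log and continue instead of applying a "stop" policy.
                System.err.println("Skipping file: " + e.getMessage());
                skipped.add(file);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("ka-1-Data.db", "ka-2-Data.db", "ka-3-Data.db");
        List<String> skipped = scrubAll(files, file -> {
            if (file.startsWith("ka-2")) {
                throw new CorruptFileException(file);
            }
            System.out.println("Scrubbed " + file);
        });
        System.out.println("Skipped: " + skipped);
    }
}
```

The design point is that a per-file failure is caught inside the loop, so one unreadable file does not prevent the remaining files from being processed.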

          People

            yukim Yuki Morishita
            ahattrell Adam Hattrell
            Yuki Morishita
            Paulo Motta
            Votes: 1
            Watchers: 6
