CASSANDRA-11750

Offline scrub should not abort when it hits corruption


Details

    • Low

    Description

      Hit a failure on startup due to corruption of some sstables in the system keyspace. Deleted the listed file and restarted; the node came down again on another corrupt file.

      Figured that I may as well run scrub to clean up all the files. Got the following error:

      sstablescrub system compaction_history 
      ERROR 17:21:34 Exiting forcefully due to file system exception on startup, disk failure policy "stop" 
      org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /cassandra/data/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/system-compaction_history-ka-1936-CompressionInfo.db 
      at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:169) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:741) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:692) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:480) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
      at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:376) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046]
      at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:523) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_79] 
      at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_79] 
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79] 
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79] 
      at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] 
      Caused by: java.io.EOFException: null 
      at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_79] 
      at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_79] 
      at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_79] 
      at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[cassandra-all-2.1.12.1046.jar:2.1.12.1046] 
      ... 14 common frames omitted 
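
      The "Caused by" frames show DataInputStream.readUTF running out of bytes inside the CompressionMetadata constructor, which would be consistent with an empty or truncated CompressionInfo.db. That is an assumption; the file itself was not inspected here. A minimal standalone sketch of how readUTF fails on truncated input (the class and method names are made up for illustration and are not Cassandra code):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

public class TruncatedReadDemo {
    // Hypothetical helper: returns true if readUTF() reaches end-of-stream
    // before it has read a complete length-prefixed string.
    static boolean failsWithEof(byte[] contents) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(contents))) {
            in.readUTF();
            return false;
        } catch (EOFException e) {
            // Same exception type as the "Caused by" in the trace above.
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Empty input: the 2-byte length prefix cannot be read.
        System.out.println(failsWithEof(new byte[0]));
        // Length prefix says 5 bytes, but only 1 byte follows.
        System.out.println(failsWithEof(new byte[] {0, 5, 'a'}));
        // Well-formed: length 2 followed by "ok".
        System.out.println(failsWithEof(new byte[] {0, 2, 'o', 'k'}));
    }
}
```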
      

      I guess it might be by design - but I'd argue that I should at least have the option to continue and let it do its thing. I'd prefer that sstablescrub ignored the disk failure policy.
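
      A rough sketch of the behaviour I have in mind: try each file, log and skip the ones that fail to open, and carry on with the rest. Everything in it (SkipCorruptSketch, CorruptFileException, Opener, openAll) is a hypothetical stand-in, not Cassandra's actual classes or the way sstablescrub is implemented:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipCorruptSketch {
    // Hypothetical stand-in for a corruption error; not Cassandra's CorruptSSTableException.
    static class CorruptFileException extends RuntimeException {
        CorruptFileException(String path) {
            super("Corrupted: " + path);
        }
    }

    // Hypothetical hook that opens one file and throws if it is unreadable.
    interface Opener {
        void open(String path);
    }

    // Tries every path instead of aborting on the first failure;
    // returns the paths that could not be opened.
    static List<String> openAll(List<String> paths, Opener opener) {
        List<String> skipped = new ArrayList<>();
        for (String path : paths) {
            try {
                opener.open(path);
            } catch (CorruptFileException e) {
                System.err.println("Skipping " + path + ": " + e.getMessage());
                skipped.add(path);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Illustrative file names only; the second one is treated as corrupt.
        List<String> paths = Arrays.asList("one-Data.db", "two-Data.db", "three-Data.db");
        List<String> skipped = openAll(paths, p -> {
            if (p.startsWith("two")) {
                throw new CorruptFileException(p);
            }
        });
        System.out.println("Skipped: " + skipped);
    }
}
```

      The point is only that one unreadable file ends up in a "skipped" list rather than terminating the whole run.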


          People

            yukim Yuki Morishita
            ahattrell Adam Hattrell
            Yuki Morishita
            Paulo Motta (Deprecated)
            Votes: 1
            Watchers: 6
