Uploaded image for project: 'Apache Cassandra'
  1. Apache Cassandra
  2. CASSANDRA-12582

Removing static column results in ReadFailure due to CorruptSSTableException

Agile BoardAttach filesAttach ScreenshotBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersCreate sub-taskConvert to sub-taskMoveLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Critical

    Description

      We ran into an issue on production where reads began to fail for certain queries, depending on the range within the relation for those queries. Cassandra system log showed an unhandled CorruptSSTableException exception.

      CQL read failure:

      ReadFailure: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
      

      Cassandra exception:

      WARN  [SharedPool-Worker-2] 2016-08-31 12:49:27,979 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-2,5,main]: {}
      java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /usr/local/apache-cassandra-3.0.8/data/data/issue309/apples_by_tree-006748a06fa311e6a7f8ef8b642e977b/mb-1-big-Data.db
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2453) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_72]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-3.0.8.jar:3.0.8]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
      Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /usr/local/apache-cassandra-3.0.8/data/data/issue309/apples_by_tree-006748a06fa311e6a7f8ef8b642e977b/mb-1-big-Data.db
        at org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:343) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:65) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.isReverseOrder(LazilyInitializedUnfilteredRowIterator.java:66) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:62) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:96) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:295) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:134) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:127) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:123) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:65) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:289) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1796) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2449) ~[apache-cassandra-3.0.8.jar:3.0.8]
        ... 5 common frames omitted
      Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /usr/local/apache-cassandra-3.0.8/data/data/issue309/apples_by_tree-006748a06fa311e6a7f8ef8b642e977b/mb-1-big-Data.db
        at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:130) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:46) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:69) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.io.sstable.format.big.BigTableScanner$KeyScanningIterator$1.initializeIterator(BigTableScanner.java:338) ~[apache-cassandra-3.0.8.jar:3.0.8]
        ... 19 common frames omitted
      Caused by: java.io.IOException: Corrupt (negative) value length encountered
        at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:399) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.rows.BufferCell$Serializer.deserialize(BufferCell.java:302) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:462) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:440) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeStaticRow(UnfilteredSerializer.java:381) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.readStaticRow(AbstractSSTableIterator.java:179) ~[apache-cassandra-3.0.8.jar:3.0.8]
        at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:103) ~[apache-cassandra-3.0.8.jar:3.0.8]
        ... 22 common frames omitted
      

      After debugging, it appears that a previously dropped static column (weeks prior) was the instigator of the issue. As a workaround we added back the column, restarted all cassandra processes within the cluster, and the read error and corruption exception went away.

      Attached is a script to reproduce with a simple schema.

      Also noteworthy (and shown in the script) is that when in this state, compaction silently failed (exit 0) to remove the dropped static columns from the "corrupted" sstable.

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            stefania Stefania Alborghetti Assign to me
            eprothro Evan Prothro
            Stefania Alborghetti
            Carl Yeksigian
            Votes:
            2 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment