Cassandra / CASSANDRA-5365

org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Urgent
    • Resolution: Not A Problem
    • Affects Version/s: 1.2.3
    • None
    • None
    • Environment: 3-node Debian Linux cluster
      Oracle JDK - java version "1.7.0_15"
      Hadoop 1.0.4
      1 node running the Hadoop job that inserts data

    • Severity: Critical

    Description

      Create the keyspace and column family via the CLI:

      create keyspace vectorization WITH placement_strategy = 'SimpleStrategy' AND strategy_options = {replication_factor:2};
      create column family dict WITH comparator = UTF8Type AND key_validation_class=UTF8Type AND column_metadata = [ {column_name: id, validation_class: LongType} {column_name: df, validation_class: LongType}];
      

      Next I run my Hadoop job, which reads data from another keyspace (same cluster) and reduces it into dict.

      Afterwards I try to read some values via the CLI:

      [default@unknown] use vectorization;
      [default@vectorization] assume dict keys as ascii;
      [default@vectorization] get dict['xyz'];
      => (column=df, value=329305, timestamp=1363715523545000)
      => (column=id, value=8477047, timestamp=1363715523545000)
      Returned 2 results.
      Elapsed time: 38 msec(s).
      [default@vectorization] get dict['14'];
      null
      TimedOutException()
              at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7874)
      [...]
      

      At the same time, the server logs:

      ERROR 09:42:46,834 Exception in thread Thread[ReadStage:42281,5,main]
      java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
              at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:722)
      Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
              at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:106)
              at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:38)
              at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
              at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
              at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:90)
              at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:171)
              at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
              at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
              at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
              at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
              at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
              at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
              at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
              at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
              at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1363)
              at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1220)
              at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1132)
              at org.apache.cassandra.db.Table.getRow(Table.java:355)
              at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
              at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
              at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
              ... 3 more
      Caused by: java.io.EOFException
              at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
              at java.io.RandomAccessFile.readFully(RandomAccessFile.java:394)
              at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:380)
              at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
              at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
              at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
              at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
              at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
              at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:102)
              ... 23 more
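
The innermost frames (readWithLength calling readFully) are consistent with a length-prefixed value whose declared length runs past the end of the file. The following is a minimal standalone sketch of that failure mode, not Cassandra's code: it assumes a 4-byte length prefix and uses in-memory streams instead of an SSTable file, and the class and method names (TruncatedReadDemo, writeWithLength, readWithLength, failsWithEof) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TruncatedReadDemo {

    // Serialize a value as a 4-byte big-endian length followed by the raw bytes.
    // (Assumed layout for illustration only.)
    static byte[] writeWithLength(byte[] value) {
        return ByteBuffer.allocate(4 + value.length)
                .putInt(value.length)
                .put(value)
                .array();
    }

    // Read the length prefix, then exactly that many bytes.
    // readFully throws EOFException if the data ends before the buffer is filled.
    static byte[] readWithLength(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int length = in.readInt();
        byte[] value = new byte[length];
        in.readFully(value);
        return value;
    }

    // Returns true if reading the given bytes ends in an EOFException.
    static boolean failsWithEof(byte[] data) {
        try {
            readWithLength(data);
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] full = writeWithLength("329305".getBytes(StandardCharsets.UTF_8));
        // Simulate a file that was cut short: drop the last two bytes.
        byte[] truncated = Arrays.copyOf(full, full.length - 2);
        System.out.println("intact:    EOF=" + failsWithEof(full));
        System.out.println("truncated: EOF=" + failsWithEof(truncated));
    }
}
```

In this sketch only the truncated buffer reports an EOF. This illustrates the shape of the exception, not why the data on disk would be short or misaligned in the first place.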
      

      Which keys work and which don't seems to be completely random, and it differs on each retry (drop the CF, create a new CF, rerun the Hadoop job).

          People

            Assignee: Unassigned
            Reporter: Roland von Herget (rherget)