Details
-
Bug
-
Status: Resolved
-
Urgent
-
Resolution: Not A Problem
-
None
-
None
-
3 node debian linux cluster
oracle jdk - java version "1.7.0_15"
hadoop 1.0.4
1 node running hadoop job which inserts data
-
Critical
Description
Create the keyspace and column family via the CLI:
create keyspace vectorization WITH placement_strategy = 'SimpleStrategy' AND strategy_options = {replication_factor:2};
create column family dict WITH comparator = UTF8Type AND key_validation_class=UTF8Type AND column_metadata = [ {column_name: id, validation_class: LongType} {column_name: df, validation_class: LongType}];
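For reference, the LongType validation class declared above expects each column value to be an 8-byte big-endian signed integer. The following is only a minimal sketch of that encoding, not the code used by the Hadoop job; the helper names `encode_long` and `decode_long` are hypothetical, and the sample values are the ones shown in the CLI output of this report.

```python
import struct


def encode_long(value: int) -> bytes:
    """Pack an integer as an 8-byte big-endian signed long (LongType layout)."""
    return struct.pack(">q", value)


def decode_long(raw: bytes) -> int:
    """Unpack an 8-byte big-endian signed long; reject any other length."""
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return struct.unpack(">q", raw)[0]


# Hypothetical in-memory picture of one row of 'dict' (values from this report).
row = {"id": encode_long(8477047), "df": encode_long(329305)}
```

A value whose serialized form is not exactly 8 bytes would not match this layout, which is one thing worth checking on the writer side when reads fail like this.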
Now I run my Hadoop job, which reads data from another keyspace (same cluster) and reduces it into dict.
Afterwards I try to get some values via the CLI:
[default@unknown] use vectorization;
[default@vectorization] assume dict keys as ascii;
[default@vectorization] get dict['xyz'];
=> (column=df, value=329305, timestamp=1363715523545000)
=> (column=id, value=8477047, timestamp=1363715523545000)
Returned 2 results.
Elapsed time: 38 msec(s).
[default@vectorization] get dict['14'];
null
TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:7874)
    [...]
and on the server:
ERROR 09:42:46,834 Exception in thread Thread[ReadStage:42281,5,main]
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:106)
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:38)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:90)
    at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:171)
    at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:86)
    at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:45)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:134)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1363)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1220)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1132)
    at org.apache.cassandra.db.Table.getRow(Table.java:355)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1052)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
    ... 3 more
Caused by: java.io.EOFException
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:394)
    at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:380)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
    at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
    at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
    at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:73)
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:102)
    ... 23 more
Which keys work and which don't seems to be completely random, and the set differs on each retry (drop the column family, create a new one, rerun the Hadoop job).