Details
- Bug
- Status: Resolved
- Major
- Resolution: Auto Closed
- 1.5.2, 1.6.1
- None
- None
- 1.6.0 with the patch for ACCUMULO-1628 applied to it
Description
On a high ingest/query cluster (10 to 20 nodes), I see the following:
2014-11-14 22:08:12,745 [tserver.InMemoryMap] ERROR: Failed to create mem dump file
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:267)
    at org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:314)
    at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:748)
    at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:607)
    at org.apache.accumulo.core.iterators.system.LocalityGroupIterator.seek(LocalityGroupIterator.java:142)
    at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:979)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.tserver.MemKeyConversionIterator.seek(InMemoryMap.java:168)
    at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator._switchNow(SourceSwitchingIterator.java:171)
    at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.switchNow(SourceSwitchingIterator.java:179)
    at org.apache.accumulo.tserver.InMemoryMap$MemoryIterator.switchNow(InMemoryMap.java:647)
    at org.apache.accumulo.tserver.InMemoryMap$MemoryIterator.access$900(InMemoryMap.java:601)
    at org.apache.accumulo.tserver.InMemoryMap.delete(InMemoryMap.java:746)
    at org.apache.accumulo.tserver.Tablet$TabletMemory.finalizeMinC(Tablet.java:327)
    at org.apache.accumulo.tserver.Tablet.minorCompact(Tablet.java:2068)
    at org.apache.accumulo.tserver.Tablet.access$4300(Tablet.java:170)
    at org.apache.accumulo.tserver.Tablet$MinorCompactionTask.run(Tablet.java:2134)
    at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
    at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
    at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
    at java.lang.Thread.run(Thread.java:744)
After this happens, I also see iterators failing:
ERROR: exception when running query
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:267)
    at org.apache.accumulo.core.file.rfile.RelativeKey.fastSkip(RelativeKey.java:314)
    at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader._seek(RFile.java:748)
    at org.apache.accumulo.core.file.rfile.RFile$LocalityGroupReader.seek(RFile.java:607)
    at org.apache.accumulo.core.iterators.system.LocalityGroupIterator.seek(LocalityGroupIterator.java:142)
    at org.apache.accumulo.core.file.rfile.RFile$Reader.seek(RFile.java:979)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.tserver.MemKeyConversionIterator.seek(InMemoryMap.java:168)
    at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:116)
    at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:162)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.core.iterators.system.MultiIterator.seek(MultiIterator.java:105)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.core.iterators.system.StatsIterator.seek(StatsIterator.java:64)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.core.iterators.system.DeletingIterator.seek(DeletingIterator.java:67)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.core.iterators.SkippingIterator.seek(SkippingIterator.java:37)
    at org.apache.accumulo.core.iterators.system.ColumnFamilySkippingIterator.seek(ColumnFamilySkippingIterator.java:123)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
    at org.apache.accumulo.core.iterators.WrappingIterator.seek(WrappingIterator.java:101)
    at org.apache.accumulo.core.iterators.Filter.seek(Filter.java:64)
    at org.apache.accumulo.core.iterators.system.SynchronizedIterator.seek(SynchronizedIterator.java:55)
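For context on the top frame shared by both traces: java.io.DataInputStream.readByte throws EOFException when the underlying stream ends before a byte can be read. The sketch below shows that JDK behavior in isolation. It is not Accumulo code and makes no claim about why the dumped memory file was short here; the class name EofDemo and the helper hitsEof are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class EofDemo {
    // Hypothetical helper: reads `count` bytes one at a time and reports
    // whether the stream ran out first (signalled by EOFException).
    public static boolean hitsEof(byte[] data, int count) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < count; i++) {
                in.readByte(); // throws EOFException once the input is exhausted
            }
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3};
        System.out.println(hitsEof(data, 3)); // prints false: exactly enough bytes
        System.out.println(hitsEof(data, 4)); // prints true: one read past the end
    }
}
```

A reader that expects more bytes than the source actually holds fails in this way, which is consistent with the exception type in the traces but does not by itself identify the cause.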
I am still waiting on the related logs, but when the system gets into this state, all tservers throw errors like this (our query pattern uses a lot of BatchScanners). We also see the gc take an extremely long time to run, throwing
[impl.ThriftScanner] DEBUG: Scan failed, thrift error org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: 120000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected...
but I think that is unrelated.
Issue Links
- relates to
  - ACCUMULO-1628 NPE on deep copied dumped memory iterator (Resolved)
  - ACCUMULO-2247 random walk fails when server gets EOFException (Resolved)