ACCUMULO-2495

OOM exception didn't bring down tserver

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.1
    • Fix Version/s: None
    • Component/s: tserver
    • Labels: None

      Description

      Got

      Thread "acu-problem-reporter 2" died Direct buffer memory
      	java.lang.OutOfMemoryError: Direct buffer memory
      		at java.nio.Bits.reserveMemory(Bits.java:659)
      		at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:113)
      		at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:305)
      		at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:75)
      		at sun.nio.ch.IOUtil.read(IOUtil.java:223)
      		at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
      		at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
      		at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
      		at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
      		at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
      		at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
      		at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
      		at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
      		at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
      		at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
      		at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
      		at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
      		at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
      		at org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.readAll(ThriftTransportPool.java:271)
      		at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:601)
      		at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:470)
      		at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
      		at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_update(TabletClientService.java:443)
      		at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.update(TabletClientService.java:427)
      		at org.apache.accumulo.core.client.impl.Writer.updateServer(Writer.java:69)
      		at org.apache.accumulo.core.client.impl.Writer.update(Writer.java:97)
      		at org.apache.accumulo.server.problems.ProblemReport.saveToMetadataTable(ProblemReport.java:134)
      		at org.apache.accumulo.server.problems.ProblemReports$1.run(ProblemReports.java:92)
      		at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
      		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
      		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      		at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
      		at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
      		at java.lang.Thread.run(Thread.java:701)

      while hammering a single-node setup with table creates and deletes. The master went down first with an OOM after about an hour, which is strange since I gave it a gig and was only creating and dropping tables in chunks of 64. When I brought the master back up, I saw that stack trace in the monitor, but nothing in the tserver logs.

      Initial logging was

      2014-03-18 16:59:43,977 [impl.TabletServerBatchWriter] ERROR: Failed to send tablet server 127.0.0.1:9997 its batch : Direct buffer memory
      java.lang.OutOfMemoryError: Direct buffer memory
              at java.nio.Bits.reserveMemory(Bits.java:659)
              at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:113)
              at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:305)
              at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:75)
              at sun.nio.ch.IOUtil.read(IOUtil.java:223)
              at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
              at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
              at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
              at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
              at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
              at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
              at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
              at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
              at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
              at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
              at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
              at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
              at org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.readAll(ThriftTransportPool.java:271)
              at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:601)
              at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:470)
              at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
              at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_update(TabletClientService.java:443)
              at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.update(TabletClientService.java:427)
              at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.sendMutationsToTabletServer(TabletServerBatchWriter.java:870)
              at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.access$1(TabletServerBatchWriter.java:845)
              at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.send(TabletServerBatchWriter.java:803)
              at org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.run(TabletServerBatchWriter.java:767)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
              at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
              at java.util.concurrent.FutureTask.run(FutureTask.java:166)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
              at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
              at java.lang.Thread.run(Thread.java:701)
      

      I'm using the default -XX:OnOutOfMemoryError=kill -9 %p, so I don't know why the process is still alive. This seems problematic.
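
      The behavior described above can be reproduced in isolation. In both traces the top frame is java.nio.Bits.reserveMemory, which suggests the error was raised from Java library code rather than by the VM's heap allocator. Whether -XX:OnOutOfMemoryError fires in that case is an assumption here, not something this report confirms. The sketch below only shows that an OutOfMemoryError thrown from Java code kills the thread that hit it while the process keeps running, and how an uncaught-exception handler could detect it. The class OomHandlerSketch and its helpers runAndCapture and isFatal are hypothetical names for illustration, not Accumulo code.

```java
import java.util.concurrent.atomic.AtomicReference;

public class OomHandlerSketch {

    // Hypothetical helper: run a task on its own thread and return the
    // Throwable that killed that thread, or null if it finished normally.
    static Throwable runAndCapture(Runnable task) throws InterruptedException {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread t = new Thread(task, "worker");
        t.setUncaughtExceptionHandler((thread, err) -> seen.set(err));
        t.start();
        t.join();
        return seen.get();
    }

    // Hypothetical policy: treat any OutOfMemoryError in the cause chain as fatal.
    static boolean isFatal(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate an OOM raised from Java code; no memory is actually exhausted.
        Throwable died = runAndCapture(() -> {
            throw new OutOfMemoryError("Direct buffer memory");
        });
        System.out.println("worker died with: " + died);
        System.out.println("process still running; fatal by policy: " + isFatal(died));
        // A handler that wanted the process to go down could call
        // Runtime.getRuntime().halt(1) at this point instead of continuing.
    }
}
```

      Running main should show the worker thread dying while the main thread carries on, which matches the "Thread ... died" message with the process left alive. Halting from a handler is one possible design; it is not claimed to be what Accumulo does or should do.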

      1. Test.java
        1 kB
        Keith Turner


            People

            • Assignee: Unassigned
            • Reporter: John Vines
            • Votes: 0
            • Watchers: 3
