  1. Accumulo
  2. ACCUMULO-1355

Accumulo ingestion hangs/fails for 300 MB+ file, local machine memory usage grows to 10GB+

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.4.2, 1.4.3
    • Fix Version/s: None
    • Component/s: client, rpc, tserver
    • Labels: None

    Description

      Bug:
      I am attempting to ingest a 300+ MB file into Accumulo. The ingestion process hangs and my local machine's memory consumption grows to 10 GB+.
      The mutation is never written to Accumulo.
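
The linked snippet is not reproduced in this report, so how it builds its mutation is unknown. If the whole file were being placed into a single mutation, one option would be to read the input in bounded chunks and hand each chunk to the writer separately. The following is only a sketch under that assumption: `ChunkingSketch` and `forEachChunk` are hypothetical names, only the JDK is used, and the `sink` callback marks the place where a caller would build and add one smaller mutation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical helper, not part of Accumulo: streams an input in fixed-size pieces.
final class ChunkingSketch {

    // Reads the stream to its end and passes each piece of at most chunkSize
    // bytes to sink. Returns the number of pieces delivered.
    static long forEachChunk(InputStream in, int chunkSize, Consumer<byte[]> sink) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buf = new byte[chunkSize];
        long chunks = 0;
        try {
            while (true) {
                // Fill the buffer completely, or stop early at end of stream.
                int filled = 0;
                while (filled < chunkSize) {
                    int n = in.read(buf, filled, chunkSize - filled);
                    if (n < 0) {
                        break;
                    }
                    filled += n;
                }
                if (filled == 0) {
                    return chunks;
                }
                // Copy so the sink may keep the array after the buffer is reused.
                sink.accept(Arrays.copyOf(buf, filled));
                chunks++;
                if (filled < chunkSize) {
                    return chunks; // a short piece means end of stream was reached
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in data; a real caller would open the unzipped file instead.
        InputStream in = new ByteArrayInputStream(new byte[10]);
        long count = forEachChunk(in, 4,
                chunk -> System.out.println("chunk of " + chunk.length + " bytes"));
        System.out.println("chunks: " + count);
    }
}
```

Only one chunk is held in memory at a time, so client-side memory stays bounded regardless of file size. Whether this would have avoided the hang reported here is not established by this issue.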

      The file I am ingesting can be found here (needs to be unzipped): http://www.epa.gov/ttn/atw/nata2005/emissions_mdbzip/2005natav3_ei_ca.zip

      I’ve attached a snippet of code used to ingest the data here: http://pastebin.com/Yh4V6nng

      Initial Investigation:
      While attempting to upload the file, the thread that sends the mutation to the tablet waits in the waitRTE() method, which is called by synchronized void flush() in TabletServerBatchWriter.java, and is never woken up by a notify call. While it waits, my local machine's memory allocation grows to 10 GB+.
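
The behavior described above, a flush that waits until a background sender signals completion, follows the general Java wait/notify pattern. The sketch below illustrates that pattern only; it is not Accumulo's implementation. `FlushWaitSketch` and its methods are hypothetical names, and the timeout on `flush` is an addition made so the example terminates when no sender ever reports progress.

```java
// Hypothetical stand-in for a client-side write buffer; not Accumulo's actual class.
final class FlushWaitSketch {
    private long pendingBytes = 0;

    // Called by the writer: queue bytes that a background sender must deliver.
    synchronized void add(long bytes) {
        pendingBytes += bytes;
    }

    // Called by the background sender once bytes have been delivered.
    synchronized void markSent(long bytes) {
        pendingBytes -= bytes;
        notifyAll(); // wakes any thread blocked in flush()
    }

    // Blocks until everything queued has been sent or the timeout elapses.
    // Returns true if the buffer drained, false on timeout or interruption.
    synchronized boolean flush(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (pendingBytes > 0) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                // wait(0) would block forever, so always wait at least 1 ms.
                wait(Math.max(1L, remainingNanos / 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Healthy case: a sender thread reports the bytes as sent.
        FlushWaitSketch healthy = new FlushWaitSketch();
        healthy.add(1024);
        new Thread(() -> healthy.markSent(1024)).start();
        System.out.println("healthy flush drained: " + healthy.flush(5000));

        // Stalled case: nothing ever calls markSent, so flush can only time out.
        FlushWaitSketch stalled = new FlushWaitSketch();
        stalled.add(1024);
        System.out.println("stalled flush drained: " + stalled.flush(200));
    }
}
```

Without the timeout, the stalled case would block indefinitely, which matches the symptom reported here: if the sending thread never finishes its write, the waiting thread is never notified.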

      Thread Stack Trace (from java profiler):
      java.nio.Bits.copyFromArray(Bits.java:699)
      java.nio.DirectByteBuffer.put(DirectByteBuffer.java:360)
      java.nio.DirectByteBuffer.put(DirectByteBuffer.java:331)
      sun.nio.ch.IOUtil.write(IOUtil.java:35)
      sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:336)
      org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:55)
      org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
      org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
      org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
      java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
      org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
      org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:157)
      org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.flush(ThriftTransportPool.java:299)
      org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.send_applyUpdates(TabletClientService.java:449)
      org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.applyUpdates(TabletClientService.java:436)
      sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      java.lang.reflect.Method.invoke(Method.java:597)
      org.apache.accumulo.cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:84)
      com.sun.proxy.$Proxy7.applyUpdates(Unknown Source)
      org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.sendMutationsToTabletServer(TabletServerBatchWriter.java:768)
      org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.access$1400(TabletServerBatchWriter.java:536)
      org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.send(TabletServerBatchWriter.java:700)
      org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.run(TabletServerBatchWriter.java:671)
      java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
      java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      java.util.concurrent.FutureTask.run(FutureTask.java:138)
      java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
      java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
      java.lang.Thread.run(Thread.java:680)

      Tserver log error:
      2013-04-26 15:36:25,253 [server.TNonblockingServer] ERROR: Unexpected exception while invoking!
      java.lang.RuntimeException: No Such SessionID
      at org.apache.accumulo.server.tabletserver.TabletServer$ThriftClientHandler.applyUpdates(TabletServer.java:1433)
      at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:601)
      at org.apache.accumulo.cloudtrace.instrument.thrift.TraceWrap$1.invoke(TraceWrap.java:59)
      at com.sun.proxy.$Proxy2.applyUpdates(Unknown Source)
      at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$applyUpdates.process(TabletClientService.java:2315)
      at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor.process(TabletClientService.java:2037)
      at org.apache.accumulo.server.util.TServerUtils$TimedProcessor.process(TServerUtils.java:154)
      at org.apache.thrift.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:631)
      at org.apache.accumulo.server.util.TServerUtils$THsHaServer$Invocation.run(TServerUtils.java:202)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
      at java.lang.Thread.run(Thread.java:722)

          People

            Assignee: Christopher Tubbs (ctubbsii)
            Reporter: Nasheb Ismaily (nasheb)
            Votes: 0
            Watchers: 4
