Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-15700

BytesColumnVector can get stuck trying to resize byte buffer

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 2.2.0
    • Vectorization
    • None

    Description

      While looking at HIVE-15698, hit an issue where one of the reducers was stuck in the following stack trace:

      Thread 12735: (state = IN_JAVA)
       - org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.increaseBufferSpace(int) @bci=22, line=245 (Compiled frame; information may be imprecise)
       - org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(int, byte[], int, int) @bci=18, line=150 (Interpreted frame)
       - org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.storeRowColumn(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch, int, int, boolean) @bci=536, line=442 (Compiled frame)
       - org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserialize(org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch, int) @bci=110, line=761 (Interpreted frame)
       - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectorGroup(org.apache.hadoop.io.BytesWritable, java.lang.Iterable, byte) @bci=184, line=444 (Interpreted frame)
       - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecordVector() @bci=119, line=388 (Interpreted frame)
       - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord() @bci=8, line=239 (Interpreted frame)
       - org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run() @bci=124, line=319 (Interpreted frame)
       - org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(java.util.Map, java.util.Map) @bci=30, line=185 (Interpreted frame)
       - org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(java.util.Map, java.util.Map) @bci=159, line=168 (Interpreted frame)
       - org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run() @bci=65, line=370 (Interpreted frame)
       - org.apache.tez.runtime.task.TaskRunner2Callable$1.run() @bci=133, line=73 (Interpreted frame)
       - org.apache.tez.runtime.task.TaskRunner2Callable$1.run() @bci=1, line=61 (Interpreted frame)
       - java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction, java.security.AccessControlContext) @bci=0 (Compiled frame)
       - javax.security.auth.Subject.doAs(javax.security.auth.Subject, java.security.PrivilegedExceptionAction) @bci=42, line=422 (Interpreted frame)
       - org.apache.hadoop.security.UserGroupInformation.doAs(java.security.PrivilegedExceptionAction) @bci=14, line=1724 (Interpreted frame)
       - org.apache.tez.runtime.task.TaskRunner2Callable.callInternal() @bci=38, line=61 (Interpreted frame)
       - org.apache.tez.runtime.task.TaskRunner2Callable.callInternal() @bci=1, line=37 (Interpreted frame)
       - org.apache.tez.common.CallableWithNdc.call() @bci=8, line=36 (Interpreted frame)
       - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Interpreted frame)
       - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
       - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
       - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
      

      The reducer's input was 167 9MB binary values coming from the previous map job. Per gopalv the BytesColumnVector is stuck trying to reallocate/copy all of these values into the same memory buffer.

      Attachments

        1. HIVE-15700.4.patch
          10 kB
          Jason Dere
        2. HIVE-15700.3.patch
          10 kB
          Jason Dere
        3. HIVE-15700.2.patch
          9 kB
          Jason Dere
        4. HIVE-15700.1.patch
          2 kB
          Jason Dere

        Issue Links

          Activity

            People

              jdere Jason Dere
              jdere Jason Dere
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: