IMPALA / IMPALA-4223

Avro scanner can crash when HDFS seek fails

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend

      Description

      We saw a crash with the following stack trace:

      (gdb) bt
      #0  0x00007f048d146625 in raise () from sysroot/lib64/libc.so.6
      #1  0x00007f048d147d8d in abort () from sysroot/lib64/libc.so.6
      #2  0x00007f048f2f8a55 in os::abort(bool) ()
         from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #3  0x00007f048f478f87 in VMError::report_and_die() ()
         from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #4  0x00007f048f2fd96f in JVM_handle_linux_signal ()
         from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #5  <signal handler called>
      #6  0x00007f048d19da44 in memcpy () from sysroot/lib64/libc.so.6
      #7  0x0000000000c02306 in impala::StringBuffer::Append (this=0x7ef9161af200, 
          str=0x7ef89dc79ff1 "\265?\236\033\234\255\016\201\216\332\ab\004H\264", len=len@entry=-16689)
          at /usr/src/debug/impala-2.6.0-cdh5.8.0/be/src/runtime/string-buffer.h:51
      #8  0x0000000000c5ab21 in Append (len=-16689, str=<optimized out>, this=<optimized out>)
          at /usr/src/debug/impala-2.6.0-cdh5.8.0/be/src/runtime/string-buffer.h:58
      #9  impala::ScannerContext::Stream::GetBytesInternal (this=0x7f024a876160, 
          requested_len=<optimized out>, out_buffer=0x7efcbb61d250, peek=<optimized out>, 
          out_len=0x7ef89dc79ff1) at /usr/src/debug/impala-2.6.0-cdh5.8.0/be/src/exec/scanner-context.cc:270
      #10 0x0000000000c02588 in impala::ScannerContext::Stream::GetBytes (this=<optimized out>, 
          requested_len=<optimized out>, buffer=<optimized out>, out_len=<optimized out>, 
          status=0x7ef9161af200, peek=<optimized out>)
          at /usr/src/debug/impala-2.6.0-cdh5.8.0/be/src/exec/scanner-context.inline.h:52
      #11 0x000000003c293b80 in ?? ()
      #12 0x00007efcbb61d320 in ?? ()
      #13 0x000000003c293b80 in ?? ()
      #14 0x00007efcbb61d320 in ?? ()
      #15 0x0000000000c7423e in impala::BaseSequenceScanner::SkipToSync (this=0x7ef9161af200, 
          sync=0x7f024a876160 "`\325a\273\374~", sync_size=-1151217072)
          at /usr/src/debug/impala-2.6.0-cdh5.8.0/be/src/exec/base-sequence-scanner.cc:247
      #16 0x000000003c293c68 in ?? ()
      #17 0x00007efcbb61d460 in ?? ()
      #18 0x000000000000000f in ?? ()
      #19 0x000000000000001e in ?? ()
      #20 0x0000000000000000 in ?? ()
      

      It was caused by an HDFS seek error:

      INFO:

      hdfsSeek(desiredPos=134217728): FSDataInputStream#seek error:
      java.io.EOFException: Cannot seek after EOF
              at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1548)
              at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62) 
      

      ERROR:

      I0924 23:28:22.073464 58687 disk-io-mgr-scan-range.cc:326] Cache HDFS file handle file=hdfs://hdfs/path/000000_0
      I0924 23:28:23.184056 58701 status.cc:111] Error seeking to 134217728 in file: hdfs://hdfs/path/000000_0 
      Error(255): Unknown error 255
          @           0x80dea9  (unknown)
          @           0xa100eb  (unknown)
          @           0xa0b218  (unknown)
          @           0xa0b6f4  (unknown)
          @           0xb869a7  (unknown)
          @           0xb872e4  (unknown)
          @           0xde659a  (unknown)
          @     0x7f048e2aaa51  start_thread
          @     0x7f048d1fc93d  clone
      I0924 23:28:23.184286   425 runtime-state.cc:209] Error from query 427b036aafba2e:5283d8ace7e29697: Problem parsing file hdfs://hdfs/path/000000_0 at 134201009
      I0924 23:28:23.184314   425 runtime-state.cc:209] Error from query 427b036aafba2e:5283d8ace7e29697: Error seeking to 134217728 in file: hdfs://nameservice1/hdfs/path/000000_0 
      Error(255): Unknown error 255
      

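      It looks like the BaseSequenceScanner hits this error and then tries to resume reading from the next sync marker (SkipToSync, frame #15 in the backtrace). This doesn't work because the ScannerContext::Stream is left in a bad state after the failed seek: GetBytesInternal ends up passing a negative length (len=-16689) to StringBuffer::Append, and the memcpy crashes.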
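
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not Impala code and not necessarily the fix that was committed: `SketchStream`, `Status`, `Code`, and `ShouldResync` are hypothetical names invented for this example. The sketch assumes two guards would help: a stream that refuses reads once a seek has failed (and rejects negative lengths before any copy), and a scanner-side check that only resumes at the next sync marker after a parse error, never after an I/O error.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical error categories; Impala's real Status type is richer.
enum class Code { kOk, kParseError, kIoError };

struct Status {
  Status() = default;
  Status(Code c, std::string m) : code(c), msg(std::move(m)) {}
  bool ok() const { return code == Code::kOk; }

  Code code = Code::kOk;
  std::string msg;
};

// Hypothetical stand-in for a scan stream over an in-memory "file".
class SketchStream {
 public:
  explicit SketchStream(std::string data) : data_(std::move(data)) {}

  // Fails when asked to seek past EOF and marks the stream unusable.
  Status Seek(int64_t pos) {
    if (pos < 0 || pos > static_cast<int64_t>(data_.size())) {
      failed_ = true;
      return {Code::kIoError, "Cannot seek after EOF"};
    }
    pos_ = pos;
    return {};
  }

  // Appends up to `len` bytes to `out`. Refuses to read from a failed stream
  // and rejects negative lengths instead of handing them to a copy routine.
  Status Read(int64_t len, std::string* out) {
    if (failed_) return {Code::kIoError, "stream is in a failed state"};
    if (len < 0) return {Code::kParseError, "negative read length"};
    const int64_t avail = static_cast<int64_t>(data_.size()) - pos_;
    const int64_t n = std::min(len, avail);
    out->append(data_, static_cast<std::size_t>(pos_),
                static_cast<std::size_t>(n));
    pos_ += n;
    return {};
  }

  bool failed() const { return failed_; }

 private:
  std::string data_;
  int64_t pos_ = 0;
  bool failed_ = false;
};

// Assumed policy: skipping ahead to the next sync marker is only safe after
// a parse error on a stream that is still usable, not after an I/O error.
bool ShouldResync(const SketchStream& stream, const Status& error) {
  return error.code == Code::kParseError && !stream.failed();
}
```

With these guards, a failed seek would surface as an error to the caller rather than as a read with a garbage length further down the stack.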

              People

              • Assignee: Lars Volker (lv)
              • Reporter: Tim Armstrong (tarmstrong)
              • Votes: 0
              • Watchers: 8
