  1. IMPALA
  2. IMPALA-3962

stress test core dump on two nodes with strange, possibly corrupt stack

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.7.0
    • Fix Version/s: Impala 2.7.0
    • Component/s: Backend
    • Labels:

      Description

      Two of the nodes in the stress test cluster dumped core during last night's run (debug build). The stacks are similar and look strange to me: there appears to be either stack corruption or missing symbols, and triaging that is beyond my ability.

      This was the run: http://sandbox.jenkins.cloudera.com/job/Impala-Stress-Test-Physical/660/

      Core dumps occurred on these hosts: vc0706.halxg.cloudera.com, vc0714.halxg.cloudera.com.

      The stacks are similar and resemble this:

      (gdb) bt
      #0  0x000000323f632625 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
      #1  0x000000323f633e05 in abort () at abort.c:92
      #2  0x00007f6e131c9c55 in os::abort(bool) () from /opt/toolchain/sun-jdk-64bit-1.7.0.75/jre/lib/amd64/server/libjvm.so
      #3  0x00007f6e1334bcd7 in VMError::report_and_die() () from /opt/toolchain/sun-jdk-64bit-1.7.0.75/jre/lib/amd64/server/libjvm.so
      #4  0x00007f6e131ceb6f in JVM_handle_linux_signal () from /opt/toolchain/sun-jdk-64bit-1.7.0.75/jre/lib/amd64/server/libjvm.so
      #5  <signal handler called>
      #6  __strncmp_sse42 () at ../sysdeps/x86_64/multiarch/strcmp.S:1740
      #7  0x00007f6db3d2266e in impala::InPredicate::InSetLookupWrapper ()
      #8  0x00000000944d7fa0 in ?? ()
      #9  0x00007f6a742d3240 in ?? ()
      #10 0x0000000700000000 in ?? ()
      #11 0x00007f68688e9840 in ?? ()
      #12 0x000000024a0d6a3c in ?? ()
      #13 0x0000000078b96c48 in ?? ()
      #14 0x0000000001716471 in impala::ParquetPlainEncoder::Decode<impala::DecimalValue<long> > (buffer=Cannot access memory at address 0xfffffffffffffff8
      )
          at /usr/src/debug/impala-2.6.0-cdh5.9.0-SNAPSHOT/be/src/exec/parquet-common.h:328
      Backtrace stopped: previous frame inner to this frame (corrupt stack?)
      (gdb)
      

      It was difficult to find a smoking gun in the logs, so I haven't pasted anything from there.

      The cores and logs are in impala-desktop under the dev user in a directory with this bug ID.

              People

              • Assignee: Michael Ho (kwho)
              • Reporter: Michael Brown (mikeb)
