IMPALA-4550: Mismatching types in JOIN crashes Impala

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.2, Impala 2.3.0, Impala 2.4.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Frontend

      Description

      The following query crashes Impala. Note the type mismatch between the join columns (STRING vs. TIMESTAMP). Putting b.timestamp_col into the cast prevents the crash.

      SELECT a.id
      FROM   functional.alltypes a
      JOIN   functional.alltypes b on a.string_col = b.timestamp_col
      WHERE  (Cast(a.string_col as String) > 'a');
      

      With codegen enabled, impalad hits an assert in LLVM during fragment preparation. The call stack looks like this:

      #0  0x00007f7ac4030cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
      56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
      (gdb) where
      #0  0x00007f7ac4030cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
      #1  0x00007f7ac40340d8 in __GI_abort () at abort.c:89
      #2  0x00007f7ac4029b86 in __assert_fail_base (fmt=0x7f7ac417a830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
          assertion=assertion@entry=0x17a6700 "(i >= FTy->getNumParams() || FTy->getParamType(i) == Args[i]->getType()) && \"Calling a function with a bad signature!\"",
          file=file@entry=0x17a6320 "/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0-asserts.src-p1/lib/IR/Instructions.cpp",
          line=line@entry=245,
          function=function@entry=0x20676c0 <llvm::CallInst::init(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&)::__PRETTY_FUNCTION__> "void llvm::CallInst::init(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, const llvm::Twine&)") at assert.c:92
      #3  0x00007f7ac4029c32 in __GI___assert_fail (assertion=0x17a6700 "(i >= FTy->getNumParams() || FTy->getParamType(i) == Args[i]->getType()) && \"Calling a function with a bad signature!\"",
          file=0x17a6320 "/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-14-04/toolchain/source/llvm/llvm-3.8.0-asserts.src-p1/lib/IR/Instructions.cpp", line=245,
          function=0x20676c0 <llvm::CallInst::init(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&)::__PRETTY_FUNCTION__> "void llvm::CallInst::init(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, const llvm::Twine&)") at assert.c:101
      #4  0x00000000014fab1c in llvm::CallInst::init(llvm::FunctionType*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, llvm::ArrayRef<llvm::OperandBundleDefT<llvm::Value*> >, llvm::Twine const&) ()
      #5  0x00007f7ac92919c9 in llvm::CallInst::CallInst (this=0xb08cb60, Ty=0xa4c7f80, Func=0xb0b7fe8, Args=..., Bundles=..., NameStr=..., InsertBefore=0x0)
          at /opt/Impala-Toolchain/llvm-3.8.0-asserts-p1/include/llvm/IR/Instructions.h:1848
      #6  0x00007f7ac9291860 in llvm::CallInst::Create (Ty=0xa4c7f80, Func=0xb0b7fe8, Args=..., Bundles=..., NameStr=..., InsertBefore=0x0)
          at /opt/Impala-Toolchain/llvm-3.8.0-asserts-p1/include/llvm/IR/Instructions.h:1442
      #7  0x00007f7ac9299184 in llvm::IRBuilder<true, llvm::ConstantFolder, llvm::IRBuilderDefaultInserter<true> >::CreateCall (this=0x7f7a39496240, FTy=0xa4c7f80, Callee=0xb0b7fe8, Args=..., Name=...,
          FPMathTag=0x0) at /opt/Impala-Toolchain/llvm-3.8.0-asserts-p1/include/llvm/IR/IRBuilder.h:1551
      #8  0x00007f7ac9295b7f in llvm::IRBuilder<true, llvm::ConstantFolder, llvm::IRBuilderDefaultInserter<true> >::CreateCall (this=0x7f7a39496240, Callee=0xb0b7fe8, Args=..., Name=..., FPMathTag=0x0)
          at /opt/Impala-Toolchain/llvm-3.8.0-asserts-p1/include/llvm/IR/IRBuilder.h:1568
      #9  0x00007f7ac8adfc9d in impala::CodegenAnyVal::CreateCall (cg=0xa702000, builder=0x7f7a39496240, fn=0xb0b7fe8, args=..., name=0x7f7ac89ab4c7 "result", result_ptr=0x0)
          at /home/lv/i1/be/src/codegen/codegen-anyval.cc:147
      #10 0x00007f7ac894460e in impala::ScalarFnCall::GetCodegendComputeFn (this=0x9fd3c00, codegen=0xa702000, fn=0x7f7a39496878) at /home/lv/i1/be/src/exprs/scalar-fn-call.cc:410
      #11 0x00007f7ac9b900cb in impala::RuntimeState::CodegenScalarFns (this=0x4f54000) at /home/lv/i1/be/src/runtime/runtime-state.cc:197
      #12 0x00007f7ac9b6b93a in impala::PlanFragmentExecutor::PrepareInternal (this=0xa1b1da0, request=...) at /home/lv/i1/be/src/runtime/plan-fragment-executor.cc:248
      #13 0x00007f7ac9b697dc in impala::PlanFragmentExecutor::Prepare (this=0xa1b1da0, request=...) at /home/lv/i1/be/src/runtime/plan-fragment-executor.cc:92
      #14 0x00007f7ac832fd36 in impala::FragmentMgr::FragmentExecState::Prepare (this=0xa1b1a00) at /home/lv/i1/be/src/service/fragment-exec-state.cc:50
      #15 0x00007f7ac832fdea in impala::FragmentMgr::FragmentExecState::Exec (this=0xa1b1a00) at /home/lv/i1/be/src/service/fragment-exec-state.cc:57
      

      With codegen disabled, impalad hits a segmentation fault during fragment execution. The call stack looks like this:

      #0  0x00007f8714cf0cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
      #1  0x00007f8714cf40d8 in __GI_abort () at abort.c:89
      #2  0x00007f8716f4dc55 in os::abort(bool) () from /usr/lib/jvm/java-7-oracle-amd64/jre/lib/amd64/server/libjvm.so
      #3  0x00007f87170cfcd7 in VMError::report_and_die() () from /usr/lib/jvm/java-7-oracle-amd64/jre/lib/amd64/server/libjvm.so
      #4  0x00007f8716f52b6f in JVM_handle_linux_signal () from /usr/lib/jvm/java-7-oracle-amd64/jre/lib/amd64/server/libjvm.so
      #5  <signal handler called>
      #6  __strncmp_ssse3 () at ../sysdeps/x86_64/multiarch/../strcmp.S:2272
      #7  0x00007f871a83ff2f in impala::StringCompare (s1=0x0, n1=2455137, s2=0xa20a030 "a", n2=1, len=1) at /home/lv/i1/be/src/runtime/string-value.inline.h:60
      #8  0x00007f871a840996 in impala::StringValue::Compare (this=0x7f868514c990, other=...) at /home/lv/i1/be/src/runtime/string-value.inline.h:77
      #9  0x00007f87195264d7 in impala::StringValue::Gt (this=0x7f868514c990, other=...) at /home/lv/i1/be/src/runtime/string-value.inline.h:122
      #10 0x00007f8719526501 in impala::StringValue::operator> (this=0x7f868514c990, other=...) at /home/lv/i1/be/src/runtime/string-value.inline.h:126
      #11 0x00007f87195c0e4c in impala::Operators::Gt_StringVal_StringVal (c=0x32cf5f0, v1=..., v2=...) at /home/lv/i1/be/src/exprs/operators-ir.cc:229
      #12 0x00007f8719607fb2 in impala::ScalarFnCall::InterpretEval<impala_udf::BooleanVal> (this=0x996fa00, context=0xa242600, row=0xa25e000) at /home/lv/i1/be/src/exprs/scalar-fn-call.cc:589
      #13 0x00007f8719605edf in impala::ScalarFnCall::GetBooleanVal (this=0x996fa00, context=0xa242600, row=0xa25e000) at /home/lv/i1/be/src/exprs/scalar-fn-call.cc:639
      #14 0x00007f87195995c7 in impala::ExprContext::GetBooleanVal (this=0xa242600, row=0xa25e000) at /home/lv/i1/be/src/exprs/expr-context.cc:346
      #15 0x00007f8719fcb012 in impala::ExecNode::EvalConjuncts (ctxs=0x32cf5f8, num_ctxs=1, row=0xa25e000) at /home/lv/i1/be/src/exec/exec-node.cc:440
      #16 0x00007f871a03e918 in impala::HdfsScanner::EvalConjuncts (this=0xa250c00, row=0xa25e000) at /home/lv/i1/be/src/exec/hdfs-scanner.h:350
      #17 0x00007f871a03a9f7 in impala::HdfsScanner::WriteCompleteTuple (this=0xa250c00, pool=0xa254148, fields=0xa258000, tuple=0xa260000, tuple_row=0xa25e000, template_tuple=0x0,
          error_fields=0x7f868514cd00 "", error_in_row=0x7f868514cd6f "") at /home/lv/i1/be/src/exec/hdfs-scanner.cc:248
      #18 0x00007f871a04645c in impala::HdfsScanner::WriteAlignedTuples (this=0xa250c00, pool=0xa254148, tuple_row=0xa25e000, row_size=8, fields=0xa258000, num_tuples=300, max_added_tuples=300,
          slots_per_tuple=1, row_idx_start=0) at /home/lv/i1/be/src/exec/hdfs-scanner-ir.cc:56
      #19 0x00007f871a078e74 in impala::HdfsTextScanner::WriteFields (this=0xa250c00, pool=0xa254148, tuple_row=0xa25e000, num_fields=300, num_tuples=300) at /home/lv/i1/be/src/exec/hdfs-text-scanner.cc:807
      #20 0x00007f871a075282 in impala::HdfsTextScanner::ProcessRange (this=0xa250c00, num_tuples=0x7f868514d28c, past_scan_range=false) at /home/lv/i1/be/src/exec/hdfs-text-scanner.cc:392
      #21 0x00007f871a072ebb in impala::HdfsTextScanner::ProcessSplit (this=0xa250c00) at /home/lv/i1/be/src/exec/hdfs-text-scanner.cc:171
      #22 0x00007f871a010067 in impala::HdfsScanNode::ProcessSplit (this=0xa1e4700, filter_ctxs=std::vector of length 0, capacity 0, scan_range=0xa1ed380) at /home/lv/i1/be/src/exec/hdfs-scan-node.cc:527
      

        Activity

        lv Lars Volker added a comment -

        IMPALA-4550: Fix CastExpr analysis for substituted slots

        During slot substitution, the type of the child of a CastExpr can
        change. If the previous child type matched the CastExpr, then the cast
        was flagged as noOp_. During substitution and subsequent re-analysis
        the noOp_ flag was not revisited so that no cast was performed, even
        after it had become necessary.

        The fix is to always set noOp_ to the correct value in
        CastExpr.analyze().

        Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
        Reviewed-on: http://gerrit.cloudera.org:8080/5267
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Internal Jenkins
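
The stale-flag problem described in the commit message above can be illustrated with a small toy model. This is only a sketch, not Impala's frontend code (which is Java): the class and method names below (`SlotRef`, `CastExpr`, `analyze_buggy`, `analyze_fixed`, `substitute`, `run`) are hypothetical, types are modeled as plain strings, and the exact shape of the original buggy check is an assumption. It only shows why a flag that is set once and never recomputed goes wrong after the child's type changes.

```python
from dataclasses import dataclass


@dataclass
class SlotRef:
    """Hypothetical stand-in for a column reference with a resolved type."""
    name: str
    type: str


class CastExpr:
    """Toy model of a cast expression; not Impala's actual class."""

    def __init__(self, child, target_type):
        self.child = child
        self.target_type = target_type
        self.no_op = False  # plays the role of noOp_

    def analyze_buggy(self):
        # Assumed pre-fix behaviour: the flag is set when the types match,
        # but never cleared on re-analysis.
        if self.child.type == self.target_type:
            self.no_op = True

    def analyze_fixed(self):
        # Post-fix behaviour: always recompute the flag from the current child.
        self.no_op = self.child.type == self.target_type

    def substitute(self, new_child):
        # Slot substitution can replace the child with one of another type.
        self.child = new_child


def run(analyze_name):
    """Analyze, substitute the child, re-analyze; return both flag values."""
    cast = CastExpr(SlotRef("a.string_col", "STRING"), "STRING")
    getattr(cast, analyze_name)()
    before = cast.no_op
    cast.substitute(SlotRef("b.timestamp_col", "TIMESTAMP"))
    getattr(cast, analyze_name)()
    return before, cast.no_op


print(run("analyze_buggy"))  # (True, True): cast is skipped although now needed
print(run("analyze_fixed"))  # (True, False): cast is performed after substitution
```

In the buggy variant the cast to STRING stays marked as a no-op even though its child is now a TIMESTAMP, which is consistent with the backend later treating a non-string value as a string in both stack traces.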

        srus@cloudera.com Silvius Rus added a comment -

        Lars Volker, can you please set the earlier affected versions in Affects Version/s?

        lv Lars Volker added a comment -

        Done


          People

          • Assignee: lv Lars Volker
          • Reporter: lv Lars Volker
          • Votes: 0
          • Watchers: 8
