Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:

      Description

      {{for (ExprContext* ctx : input_expr_ctxs_) DCHECK(ctx->opened()); }} failed in the non-partitioned joins / aggs build (http://sandbox.jenkins.cloudera.com/job/impala-cdh5-trunk-core-non-partitioned-joins-and-aggs/146/console).

      #6  0x00000000027ead4e in google::LogMessageFatal::~LogMessageFatal() ()
      #7  0x0000000001809e6d in impala::AggFnEvaluator::Init (this=0x9327680, agg_fn_ctx=0x549f3b0, dst=0xaf9b000) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exprs/agg-fn-evaluator.cc:330
      #8  0x0000000001783841 in impala::AggregationNode::ConstructIntermediateTuple (this=0xce32480) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/aggregation-node.cc:354
      #9  0x0000000001780afd in impala::AggregationNode::Prepare (this=0xce32480, state=0x1dac1b00) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/aggregation-node.cc:162
      #10 0x00000000019b184f in impala::PlanFragmentExecutor::PrepareInternal (this=0x2d242a90, request=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:218
      #11 0x00000000019afafa in impala::PlanFragmentExecutor::Prepare (this=0x2d242a90, request=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:91
      #12 0x000000000150d66e in impala::FragmentMgr::FragmentExecState::Prepare (this=0x2d242700) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-exec-state.cc:50
      #13 0x000000000150d722 in impala::FragmentMgr::FragmentExecState::Exec (this=0x2d242700) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-exec-state.cc:57
      #14 0x0000000001504f06 in impala::FragmentMgr::FragmentThread (this=0x98eb580, fragment_instance_id=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-mgr.cc:86
      

        Activity

        Hide
        dhecht Dan Hecht added a comment -

        Tim is on PTO. This is almost certainly fall out from

        commit d7246d64c777384f29fe0f824ee0036b58e8aa2d
        Author: Tim Armstrong <tarmstrong@cloudera.com>
        Date:   Wed Sep 28 13:04:07 2016 -0700
        
            IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
        

        Michael Ho, Is there any easy fix, or should we just go ahead and backout the change?

        Show
        dhecht Dan Hecht added a comment - Tim is on PTO. This is almost certainly fall out from commit d7246d64c777384f29fe0f824ee0036b58e8aa2d Author: Tim Armstrong <tarmstrong@cloudera.com> Date: Wed Sep 28 13:04:07 2016 -0700 IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions Michael Ho , Is there any easy fix, or should we just go ahead and backout the change?
        Hide
        kwho Michael Ho added a comment -

        Let me take a look first.

        Show
        kwho Michael Ho added a comment - Let me take a look first.
        Hide
        dhecht Dan Hecht added a comment -

        Hmm, actually this job http://sandbox.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-core-non-partitioned-joins-and-aggs/127/

        failed before that change with the same DCHECK, so that's not the cause.

        03:51:36.677 Commencing build of Revision ce4c5f67433ebe050261710920972621d625c81c (origin/master)
        03:51:36.677 Checking out Revision ce4c5f67433ebe050261710920972621d625c81c (origin/master)
        
        #6  0x00000000027e353e in google::LogMessageFatal::~LogMessageFatal() ()
        #7  0x000000000180262d in impala::AggFnEvaluator::Init (this=0xab3eb20, agg_fn_ctx=0xa7c9698, dst=0x11a62000) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exprs/agg-fn-evaluator.cc:330
        #8  0x000000000177c1bf in impala::AggregationNode::ConstructIntermediateTuple (this=0xc1393c0) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/aggregation-node.cc:354
        #9  0x000000000177947b in impala::AggregationNode::Prepare (this=0xc1393c0, state=0x9e52400) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/aggregation-node.cc:162
        #10 0x00000000019aa3e5 in impala::PlanFragmentExecutor::PrepareInternal (this=0xc164a90, request=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:218
        #11 0x00000000019a8690 in impala::PlanFragmentExecutor::Prepare (this=0xc164a90, request=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:91
        #12 0x00000000015078a6 in impala::FragmentMgr::FragmentExecState::Prepare (this=0xc164700) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-exec-state.cc:50
        #13 0x000000000150795a in impala::FragmentMgr::FragmentExecState::Exec (this=0xc164700) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-exec-state.cc:57
        #14 0x00000000014ff13e in impala::FragmentMgr::FragmentThread (this=0xb847940, fragment_instance_id=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-mgr.cc:86
        
        Show
        dhecht Dan Hecht added a comment - Hmm, actually this job http://sandbox.jenkins.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-core-non-partitioned-joins-and-aggs/127/ failed before that change with the same DCHECK, so that's not the cause. 03:51:36.677 Commencing build of Revision ce4c5f67433ebe050261710920972621d625c81c (origin/master) 03:51:36.677 Checking out Revision ce4c5f67433ebe050261710920972621d625c81c (origin/master) #6 0x00000000027e353e in google::LogMessageFatal::~LogMessageFatal() () #7 0x000000000180262d in impala::AggFnEvaluator::Init ( this =0xab3eb20, agg_fn_ctx=0xa7c9698, dst=0x11a62000) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exprs/agg-fn-evaluator.cc:330 #8 0x000000000177c1bf in impala::AggregationNode::ConstructIntermediateTuple ( this =0xc1393c0) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/aggregation-node.cc:354 #9 0x000000000177947b in impala::AggregationNode::Prepare ( this =0xc1393c0, state=0x9e52400) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/aggregation-node.cc:162 #10 0x00000000019aa3e5 in impala::PlanFragmentExecutor::PrepareInternal ( this =0xc164a90, request=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:218 #11 0x00000000019a8690 in impala::PlanFragmentExecutor::Prepare ( this =0xc164a90, request=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:91 #12 0x00000000015078a6 in impala::FragmentMgr::FragmentExecState::Prepare ( this =0xc164700) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-exec-state.cc:50 #13 0x000000000150795a in impala::FragmentMgr::FragmentExecState::Exec ( this =0xc164700) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-exec-state.cc:57 #14 0x00000000014ff13e in impala::FragmentMgr::FragmentThread ( this =0xb847940, fragment_instance_id=...) at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-mgr.cc:86
        Hide
        dhecht Dan Hecht added a comment -

        Builds with this change passed:

        * e3483c4 IMPALA-4441: Divide-by-zero in RuntimeProfile::SummaryStatsCounter::SetStats (by Sailesh 2 days ago)
        

        So these would be the suspects:

        * 381e719 IMPALA-4266: Java udf returning string can give incorrect results (by Tim Armstrong 6 days ago)
        * 10fa472 IMPALA-4302,IMPALA-2379: constant expr arg fixes (by Tim Armstrong 2 weeks ago)
        
        Show
        dhecht Dan Hecht added a comment - Builds with this change passed: * e3483c4 IMPALA-4441: Divide-by-zero in RuntimeProfile::SummaryStatsCounter::SetStats (by Sailesh 2 days ago) So these would be the suspects: * 381e719 IMPALA-4266: Java udf returning string can give incorrect results (by Tim Armstrong 6 days ago) * 10fa472 IMPALA-4302,IMPALA-2379: constant expr arg fixes (by Tim Armstrong 2 weeks ago)
        Hide
        henryr Henry Robinson added a comment -

        FYI, I got the DCHECK wrong. It's this one on the next line:

        for (ExprContext* ctx : input_expr_ctxs_) DCHECK(ctx->opened());

        Show
        henryr Henry Robinson added a comment - FYI, I got the DCHECK wrong. It's this one on the next line: for (ExprContext* ctx : input_expr_ctxs_) DCHECK(ctx->opened());
        Hide
        henryr Henry Robinson added a comment -

        The AggregationNode calls the AggFnEvaluator differently from the PartitionedAggregationNode. The check's from Init(), which is called from AggregationNode::Prepare(). But the check presupposes that AggFnEvaluator::Open() has been called first, but the typical lifecycle is Prepare() -> Open(), not the other way around.

        In the PartitionedAggregationNode, AggFnEvaluator is called, after some indirection, from Open().

        I guess the fix is to move some or all of this block from AggregationNode::Prepare() to Open() :

        if (probe_expr_ctxs_.empty()) {
            // create single intermediate tuple now; we need to output something
            // even if our input is empty
            singleton_intermediate_tuple_ = ConstructIntermediateTuple(); // <<<< Should be called from Open().
            // Check for failures during AggFnEvaluator::Init().
            RETURN_IF_ERROR(state->GetQueryStatus());
            hash_tbl_->Insert(singleton_intermediate_tuple_);
            output_iterator_ = hash_tbl_->Begin();
          }
        
        Show
        henryr Henry Robinson added a comment - The AggregationNode calls the AggFnEvaluator differently from the PartitionedAggregationNode . The check's from Init() , which is called from AggregationNode::Prepare() . But the check presupposes that AggFnEvaluator::Open() has been called first, but the typical lifecycle is Prepare() -> Open() , not the other way around. In the PartitionedAggregationNode , AggFnEvaluator is called, after some indirection, from Open() . I guess the fix is to move some or all of this block from AggregationNode::Prepare() to Open() : if (probe_expr_ctxs_.empty()) { // create single intermediate tuple now; we need to output something // even if our input is empty singleton_intermediate_tuple_ = ConstructIntermediateTuple(); // <<<< Should be called from Open(). // Check for failures during AggFnEvaluator::Init(). RETURN_IF_ERROR(state->GetQueryStatus()); hash_tbl_->Insert(singleton_intermediate_tuple_); output_iterator_ = hash_tbl_->Begin(); }
        Hide
        henryr Henry Robinson added a comment -

        (The guilty party is 10fa472 IMPALA-4302,IMPALA-2379: constant expr arg fixes (by Tim Armstrong 2 weeks ago) if we decide to revert instead).

        Show
        henryr Henry Robinson added a comment - (The guilty party is 10fa472 IMPALA-4302 , IMPALA-2379 : constant expr arg fixes (by Tim Armstrong 2 weeks ago) if we decide to revert instead).
        Hide
        kwho Michael Ho added a comment -

        I agree with Henry Robinson that it's probably the right approach. FWIW, commit 10fa472 already moved the creation of the singleton tuple to PartitionedAggregationNode::Open() for the same reason.

        Show
        kwho Michael Ho added a comment - I agree with Henry Robinson that it's probably the right approach. FWIW, commit 10fa472 already moved the creation of the singleton tuple to PartitionedAggregationNode::Open() for the same reason.
        Hide
        kwho Michael Ho added a comment -

        IMPALA-4452: Always call AggFnEvaluator::Open() before AggFnEvaluator::Init()

        As part of the fix for IMPALA-2379, the expression contexts of
        aggregation function evaluators are expected to be opened before
        their initFn() are called so \ constant arguments can be accessed
        in initFn(). However, the legacy aggregation node wasn't updated
        to follow this order for singleton result tuple (i.e. no group-by).

        This patch fixes the problem by deferring the creation of the
        singleton tuple to a point in AggregationNode::Open() after the
        expression contexts of all aggregate function evaluators have
        been opened. PartitionedAggregationNode() was already updated
        to follow this order.

        This patch also fixes a minor bug in which uninitialized entries
        of agg_fn_ctxs_[] may be accessed in AggregationNode::Close()
        if AggregationNode::Prepare() fails.

        Change-Id: I2f261dee47821c517d8dbe1babf4112462d85807
        Reviewed-on: http://gerrit.cloudera.org:8080/5049
        Reviewed-by: Michael Ho <kwho@cloudera.com>
        Tested-by: Internal Jenkins

        Show
        kwho Michael Ho added a comment - IMPALA-4452 : Always call AggFnEvaluator::Open() before AggFnEvaluator::Init() As part of the fix for IMPALA-2379 , the expression contexts of aggregation function evaluators are expected to be opened before their initFn() are called so \ constant arguments can be accessed in initFn(). However, the legacy aggregation node wasn't updated to follow this order for singleton result tuple (i.e. no group-by). This patch fixes the problem by deferring the creation of the singleton tuple to a point in AggregationNode::Open() after the expression contexts of all aggregate function evaluators have been opened. PartitionedAggregationNode() was already updated to follow this order. This patch also fixes a minor bug in which uninitialized entries of agg_fn_ctxs_[] may be accessed in AggregationNode::Close() if AggregationNode::Prepare() fails. Change-Id: I2f261dee47821c517d8dbe1babf4112462d85807 Reviewed-on: http://gerrit.cloudera.org:8080/5049 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins

          People

          • Assignee:
            kwho Michael Ho
            Reporter:
            henryr Henry Robinson
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development