
DCHECK when running test_cancellation.py with MT_DOP > 0

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend

      Description

      Repro:

      bin/start-impala-cluster.py --impalad_args="--default_query_options=mt_dop=3"
      tests/run-tests.py query_test/test_cancellation.py 
      

      DCHECK from logs:

      F1109 14:43:08.769706 12831 mem-pool.cc:69] Check failed: chunks_.empty() Must call FreeAll() or AcquireData() for this pool
      

      Stack:

      #7  0x00000000013cc926 in impala::MemPool::~MemPool (this=0x85f7b40, __in_chrg=<optimized out>)
          at /home/abehm/impala/be/src/runtime/mem-pool.cc:69
      #8  0x00000000015640bf in boost::checked_delete<impala::MemPool> (x=0x85f7b40)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/core/checked_delete.hpp:34
      #9  0x0000000001563b99 in boost::scoped_ptr<impala::MemPool>::~scoped_ptr (this=0xa638ea8, __in_chrg=<optimized out>)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/smart_ptr/scoped_ptr.hpp:82
      #10 0x00000000016c9bfa in impala::HdfsParquetScanner::~HdfsParquetScanner (this=0xa638c80, __in_chrg=<optimized out>)
          at /home/abehm/impala/be/src/exec/hdfs-parquet-scanner.h:324
      #11 0x00000000016c9cc6 in impala::HdfsParquetScanner::~HdfsParquetScanner (this=0xa638c80, __in_chrg=<optimized out>)
          at /home/abehm/impala/be/src/exec/hdfs-parquet-scanner.h:324
      #12 0x0000000001677314 in boost::checked_delete<impala::HdfsScanner> (x=0xa638c80)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/core/checked_delete.hpp:34
      #13 0x0000000001676265 in boost::scoped_ptr<impala::HdfsScanner>::~scoped_ptr (this=0x7f56f38e7e40, __in_chrg=<optimized out>)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/smart_ptr/scoped_ptr.hpp:82
      #14 0x0000000001685a83 in boost::scoped_ptr<impala::HdfsScanner>::reset (this=0x9bb4cc0, p=0x0)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/smart_ptr/scoped_ptr.hpp:88
      #15 0x00000000016973b3 in impala::HdfsScanNodeMt::Close (this=0x9bb4800, state=0x953b600)
          at /home/abehm/impala/be/src/exec/hdfs-scan-node-mt.cc:120
      #16 0x0000000001647754 in impala::ExecNode::Close (this=0x9a5ad00, state=0x953b600) at /home/abehm/impala/be/src/exec/exec-node.cc:196
      #17 0x0000000001738da7 in impala::PartitionedAggregationNode::Close (this=0x9a5ad00, state=0x953b600)
          at /home/abehm/impala/be/src/exec/partitioned-aggregation-node.cc:726
      #18 0x00000000019cdf1e in impala::PlanFragmentExecutor::Close (this=0x9ea4390)
          at /home/abehm/impala/be/src/runtime/plan-fragment-executor.cc:546
      #19 0x0000000001526296 in impala::FragmentMgr::FragmentExecState::Exec (this=0x9ea4000)
          at /home/abehm/impala/be/src/service/fragment-exec-state.cc:61
      #20 0x000000000151d9cc in impala::FragmentMgr::FragmentThread (this=0x9e81140, fragment_instance_id=...)
          at /home/abehm/impala/be/src/service/fragment-mgr.cc:86
      #21 0x000000000152174e in boost::_mfi::mf1<void, impala::FragmentMgr, impala::TUniqueId>::operator() (this=0xcbc9800, p=0x9e81140, a1=...)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/bind/mem_fn_template.hpp:165
      #22 0x000000000152150b in boost::_bi::list2<boost::_bi::value<impala::FragmentMgr*>, boost::_bi::value<impala::TUniqueId> >::operator()<boost::_mfi::mf1<void, impala::FragmentMgr, impala::TUniqueId>, boost::_bi::list0> (this=0xcbc9810, f=..., a=...)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/bind/bind.hpp:313
      #23 0x0000000001520e35 in boost::_bi::bind_t<void, boost::_mfi::mf1<void, impala::FragmentMgr, impala::TUniqueId>, boost::_bi::list2<boost::_bi::value<impala::FragmentMgr*>, boost::_bi::value<impala::TUniqueId> > >::operator() (this=0xcbc9800)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/bind/bind_template.hpp:20
      #24 0x00000000015207c8 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf1<void, impala::FragmentMgr, impala::TUniqueId>, boost::_bi::list2<boost::_bi::value<impala::FragmentMgr*>, boost::_bi::value<impala::TUniqueId> > >, void>::invoke (function_obj_ptr=...) at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/function/function_template.hpp:153
      #25 0x0000000001335cf2 in boost::function0<void>::operator() (this=0x7f56f38e8d30)
          at /home/abehm/impala/toolchain/boost-1.57.0/include/boost/function/function_template.hpp:767
      

          Activity

          alex.behm Alexander Behm added a comment -

          commit c97bffcce1e3d053ad6152dae300bf5233507f34
          Author: Alex Behm <alex.behm@cloudera.com>
          Date: Tue Nov 15 18:27:20 2016 -0800

          IMPALA-4458: Fix resource cleanup of cancelled mt scan nodes.

          The bug was that HdfsScanNodeMt::Close() did not properly
          clean up all in-flight resources when called through the
          query cancellation path.

          The main change is to clean up all resources when passing
          a NULL batch into HdfsParquetScanner::Close() which also
          needed similar changes in the scanner context.

          Testing: Ran test_cancellation.py, test_scanners.py and
          test_nested_types.py with MT_DOP=3. Added a test query
          with a limit that was failing before.
          A regular private hdfs/core test run succeeded.

          Change-Id: Ib32f87b3289ed9e8fc2db0885675845e11207438
          Reviewed-on: http://gerrit.cloudera.org:8080/5274
          Reviewed-by: Alex Behm <alex.behm@cloudera.com>
          Tested-by: Internal Jenkins
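
The failure mode described in the commit message can be illustrated with a minimal, self-contained sketch. This is not Impala's code: SketchPool, SketchBatch and SketchScanner are hypothetical stand-ins, and the sketch only assumes what the DCHECK text and the commit message state, namely that a pool must be emptied via FreeAll() or AcquireData() before destruction, and that Close() may receive a NULL batch on the cancellation path.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a chunk-owning memory pool (not Impala's MemPool).
class SketchPool {
 public:
  // Mirrors the spirit of the DCHECK in the log: the owner must have released
  // or handed off every chunk before the pool is destroyed.
  ~SketchPool() { assert(chunks_.empty() && "Must call FreeAll() or AcquireData()"); }

  void Allocate(std::size_t bytes) { chunks_.emplace_back(bytes); }
  void FreeAll() { chunks_.clear(); }

  // Moves all chunks from 'src' into this pool, leaving 'src' empty.
  void AcquireData(SketchPool* src) {
    for (auto& chunk : src->chunks_) chunks_.push_back(std::move(chunk));
    src->chunks_.clear();
  }

  std::size_t num_chunks() const { return chunks_.size(); }

 private:
  std::vector<std::vector<char>> chunks_;
};

// Hypothetical output batch that can take ownership of scanner memory.
struct SketchBatch {
  ~SketchBatch() { pool.FreeAll(); }  // runs before 'pool' itself is destroyed
  SketchPool pool;
};

// Hypothetical scanner with in-flight memory in a scratch pool.
class SketchScanner {
 public:
  void ReadSomeData() { scratch_pool_.Allocate(1024); }

  // 'batch' may be nullptr, e.g. when closing because of cancellation.
  void Close(SketchBatch* batch) {
    if (batch != nullptr) {
      // Normal path: hand in-flight memory to the caller's batch.
      batch->pool.AcquireData(&scratch_pool_);
    } else {
      // Cancellation path: nobody will consume the data, so free it here.
      // Skipping this branch is the kind of omission that trips the check
      // in the pool destructor.
      scratch_pool_.FreeAll();
    }
  }

  std::size_t pending_chunks() const { return scratch_pool_.num_chunks(); }

 private:
  SketchPool scratch_pool_;
};
```

Under these assumptions, a scanner closed with a null batch leaves nothing in its scratch pool, so its destructor passes the emptiness check. Whether the real fix frees or transfers each resource is determined by the change itself, not by this sketch.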

          tarmstrong Tim Armstrong added a comment -

          I ran into this too - glad that we have a test that catches it. I think I know what the problem is if you haven't taken a look at it yet.


            People

            • Assignee:
              alex.behm Alexander Behm
              Reporter:
              alex.behm Alexander Behm