IMPALA-3555

Possible hang in ASAN with huge memcpy

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: Impala 2.5.0
    • Fix Version/s: None
    • Component/s: Backend

      Description

      The ASAN exhaustive build with mem-pools disabled appears to hang while running the large strings tests. ASAN may have a problem reallocating very large buffers.

      The following query (abridged here; see the attachment for the full text) appears to be hung:

      select length(group_concat(l_comment, "!")) from (
      select l_comment from tpch_parquet.lineitem union all
      select l_comment from tpch_parquet.lineitem union all
      ... -- 48 more copies...
      select l_comment from tpch_parquet.lineitem) a
      

      It looks like the hang is in the following stack:

      Thread 6 (Thread 0x7fd323df7700 (LWP 13170)):
      #0  0x0000003ad5689c20 in memcpy () from /lib64/libc.so.6
      #1  0x0000000000e85628 in __asan::asan_realloc(void*, unsigned long, __sanitizer::BufferedStackTrace*) () at /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-centos-6/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_allocator.cc:548
      #2  0x0000000000f25102 in realloc () at /data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-centos-6/toolchain/source/llvm/llvm-3.8.0.src-p1/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:79
      #3  0x0000000001b3ed7b in impala_udf::FunctionContext::Reallocate(unsigned char*, int) () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/udf/udf.cc:310
      #4  0x000000000187e50f in impala::AggregateFunctions::StringConcatUpdate(impala_udf::FunctionContext*, impala_udf::StringVal const&, impala_udf::StringVal const&, impala_udf::StringVal*) () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exprs/aggregate-functions-ir.cc:623
      #5  0x0000000001874fb5 in impala::AggFnEvaluator::Update(impala_udf::FunctionContext*, impala::TupleRow*, impala::Tuple*, void*) () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exprs/agg-fn-evaluator.cc:365
      #6  0x00000000017af3cc in impala::PartitionedAggregationNode::UpdateTuple(impala_udf::FunctionContext**, impala::Tuple*, impala::TupleRow*, bool) () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/partitioned-aggregation-node.cc:1066
      #7  0x00000000017bf0c0 in impala::PartitionedAggregationNode::ProcessBatchNoGrouping(impala::RowBatch*, impala::HashTableCtx const*) () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/partitioned-aggregation-node-ir.cc:30
      #8  0x00000000017a4428 in impala::PartitionedAggregationNode::Open(impala::RuntimeState*) () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/exec/partitioned-aggregation-node.cc:337
      #9  0x0000000001b1f8d7 in impala::PlanFragmentExecutor::OpenInternal() () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:355
      #10 0x0000000001b1eace in impala::PlanFragmentExecutor::Open() () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/runtime/plan-fragment-executor.cc:327
      #11 0x00000000014bf0b3 in impala::FragmentMgr::FragmentExecState::Exec() () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-exec-state.cc:54
      #12 0x00000000014b18c6 in impala::FragmentMgr::FragmentThread(impala::TUniqueId) () at /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/src/service/fragment-mgr.cc:86
      #13 0x00000000014b6fbe in boost::_mfi::mf1<void, impala::FragmentMgr, impala::TUniqueId>::operator()(impala::FragmentMgr*, impala::TUniqueId) const () at /tmp/impala-deps/boost-1.57.0/include/boost/bind/mem_fn_template.hpp:165
      

      All stacks (both frames-only and including locals) are attached, though the hung instance is no longer available. When I detached the debugger, the query in question died (unreachable impalad) and the tests continued. I canceled the run anyway so that I could kick off another.

      As of now I don't have any reason to believe this affects non-ASAN builds.
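
      One possible reading of the stack, offered as a sketch rather than a confirmed diagnosis: frame #1 shows asan_realloc calling memcpy, so each reallocation copies the existing contents. If StringConcatUpdate grows the intermediate string once per input value (an assumption, not verified here), the total bytes copied would grow roughly quadratically, which could look like a hang. The function below is a hypothetical model of that cost, not Impala or ASAN code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not Impala or ASAN code: total bytes copied when a
// buffer grows by one append at a time and every growth copies the old
// contents into a fresh allocation (assumed behavior for this sketch).
std::uint64_t BytesCopiedByCopyingRealloc(std::uint64_t num_appends,
                                          std::uint64_t bytes_per_append) {
  assert(bytes_per_append > 0);
  std::uint64_t size = 0;    // current buffer size in bytes
  std::uint64_t copied = 0;  // running total of bytes moved by copies
  for (std::uint64_t i = 0; i < num_appends; ++i) {
    copied += size;            // old contents are copied before the append
    size += bytes_per_append;  // buffer now holds one more value
  }
  return copied;
}
```

      In this model, 4 appends of 10 bytes copy 0 + 10 + 20 + 30 = 60 bytes, and the total grows roughly with the square of the number of appends.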

        Attachments

        1. asan-hang-query-profile.out
          177 kB
          Matthew Jacobs
        2. asan-hang.stacks.out
          972 kB
          Matthew Jacobs
        3. asan-hang.stacks-full.out
          6.40 MB
          Matthew Jacobs
        4. asan-hang.impalad.INFO.zip
          1.17 MB
          Matthew Jacobs

            People

            • Assignee: Unassigned
            • Reporter: Matthew Jacobs (mjacobs)