  IMPALA / IMPALA-5246

Queries failing with "Process: memory limit exceeded" during ASAN builds

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.9.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    • Labels:

      Description

      #
      # A fatal error has been detected by the Java Runtime Environment:
      #
      #  SIGSEGV (0xb) at pc=0x00007f4eef4dbc33, pid=1812, tid=139979017721600
      #
      # JRE version: Java(TM) SE Runtime Environment (7.0_75-b13) (build 1.7.0_75-b13)
      # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode linux-amd64 compressed oops)
      # Problematic frame:
      # C  [libTestUdfs.1812.4.so+0x5c33]  MemTestClose(impala_udf::FunctionContext*, impala_udf::FunctionContext::FunctionStateScope)+0x33
      #
      # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
      #
      # An error report file with more information is saved as:
      # /data/jenkins/workspace/impala-umbrella-build-and-test/repos/hadoop-lzo/hs_err_pid1812.log
      #
      # If you would like to submit a bug report, please visit:
      #   http://bugreport.sun.com/bugreport/crash.jsp
      # The crash happened outside the Java Virtual Machine in native code.
      # See problematic frame for where to report the bug.
      #
      

      Still trying to find the core file. Will provide backtrace if I can find it.

        Issue Links

          Activity

          kwho Michael Ho added a comment -

          Reproduced locally with an ASAN build. Not sure why no core was generated; looking into it more.

          Stack: [0x00007f2ca18de000,0x00007f2ca20df000],  sp=0x00007f2ca20d8550,  free space=8169k
          Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
          C  [libTestUdfs.16688.0.so+0x3cb1]  MemTestClose(impala_udf::FunctionContext*, impala_udf::FunctionContext::FunctionStateScope)+0x2e
          C  [impalad+0x1772306]  impala::ScalarFnCall::Close(impala::RuntimeState*, impala::ExprContext*, impala_udf::FunctionContext::FunctionStateScope)+0x86
          C  [impalad+0x16b07ec]  impala::Expr::Close(impala::RuntimeState*, impala::ExprContext*, impala_udf::FunctionContext::FunctionStateScope)+0x9c
          C  [impalad+0x16bfdca]  impala::ExprContext::Close(impala::RuntimeState*)+0xaa
          C  [impalad+0x10b1c6e]  Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow+0x102e
          j  org.apache.impala.service.FeSupport.NativeEvalExprsWithoutRow([B[B)[B+0
          j  org.apache.impala.service.FeSupport.EvalExprsWithoutRow([B[B)[B+2
          j  org.apache.impala.service.FeSupport.EvalExprWithoutRow(Lorg/apache/impala/analysis/Expr;Lorg/apache/impala/thrift/TQueryCtx;)Lorg/apache/impala/thrift/TColumnValue;+59
          j  org.apache.impala.service.FeSupport.EvalPredicate(Lorg/apache/impala/analysis/Expr;Lorg/apache/impala/thrift/TQueryCtx;)Z+36
          j  org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Lorg/apache/impala/analysis/Expr;)Z+88
          j  org.apache.impala.planner.HdfsScanNode.computeDictionaryFilterConjuncts(Lorg/apache/impala/analysis/Analyzer;)V+122
          j  org.apache.impala.planner.HdfsScanNode.init(Lorg/apache/impala/analysis/Analyzer;)V+22
          j  org.apache.impala.planner.SingleNodePlanner.createHdfsScanPlan(Lorg/apache/impala/analysis/TableRef;ZLorg/apache/impala/analysis/Analyzer;)Lorg/apache/impala/planner/PlanNode;+373
          j  org.apache.impala.planner.SingleNodePlanner.createScanNode(Lorg/apache/impala/analysis/TableRef;ZLorg/apache/impala/analysis/Analyzer;)Lorg/apache/impala/planner/PlanNode;+21
          j  org.apache.impala.planner.SingleNodePlanner.createTableRefNode(Lorg/apache/impala/analysis/TableRef;ZLorg/apache/impala/analysis/Analyzer;)Lorg/apache/impala/planner/PlanNode;+14
          j  org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(Ljava/util/List;Ljava/util/List;ZLorg/apache/impala/analysis/Analyzer;)Lorg/apache/impala/planner/PlanNode;+41
          j  org.apache.impala.planner.SingleNodePlanner.createSelectPlan(Lorg/apache/impala/analysis/SelectStmt;Lorg/apache/impala/analysis/Analyzer;)Lorg/apache/impala/planner/PlanNode;+203
          j  org.apache.impala.planner.SingleNodePlanner.createQueryPlan(Lorg/apache/impala/analysis/QueryStmt;Lorg/apache/impala/analysis/Analyzer;Z)Lorg/apache/impala/planner/PlanNode;+31
          j  org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan()Lorg/apache/impala/planner/PlanNode;+104
          j  org.apache.impala.planner.Planner.createPlan()Ljava/util/ArrayList;+25
          j  org.apache.impala.service.Frontend.createExecRequest(Lorg/apache/impala/planner/Planner;Ljava/lang/StringBuilder;)Lorg/apache/impala/thrift/TQueryExecRequest;+111
          j  org.apache.impala.service.Frontend.createExecRequest(Lorg/apache/impala/thrift/TQueryCtx;Ljava/lang/StringBuilder;)Lorg/apache/impala/thrift/TExecRequest;+424
          j  org.apache.impala.service.JniFrontend.createExecRequest([B)[B+30
          v  ~StubRoutines::call_stub
          V  [libjvm.so+0x600045]  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*)+0x365
          V  [libjvm.so+0x5feaa8]  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*)+0x28
          V  [libjvm.so+0x639524]  jni_invoke_nonstatic(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*)+0x2b4
          V  [libjvm.so+0x64a209]  jni_CallObjectMethodV+0xe9
          C  [impalad+0xccb234]  JNIEnv_::CallObjectMethod(_jobject*, _jmethodID*, ...)+0x134
          C  [impalad+0x10a34d8]  impala::Status impala::JniUtil::CallJniMethod<impala::TQueryCtx, impala::TExecRequest>(_jobject* const&, _jmethodID* const&, impala::TQueryCtx const&, impala::TExecRequest*)+0x198
          C  [impalad+0x10a019f]  impala::Frontend::GetExecRequest(impala::TQueryCtx const&, impala::TExecRequest*)+0x1f
          C  [impalad+0x10c6b84]  impala::ImpalaServer::ExecuteInternal(impala::TQueryCtx const&, std::shared_ptr<impala::ImpalaServer::SessionState>, bool*, std::shared_ptr<impala::ImpalaServer::QueryExecState>*)+0x364
          C  [impalad+0x10c6329]  impala::ImpalaServer::Execute(impala::TQueryCtx*, std::shared_ptr<impala::ImpalaServer::SessionState>, std::shared_ptr<impala::ImpalaServer::QueryExecState>*)+0x1a9
          C  [impalad+0x11520de]  impala::ImpalaServer::query(beeswax::QueryHandle&, beeswax::Query const&)+0x2ae
          C  [impalad+0x18493dd]  beeswax::BeeswaxServiceProcessor::process_query(int, apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, void*)+0x3ed
          C  [impalad+0x1848eaa]  beeswax::BeeswaxServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string const&, int, void*)+0x3ba
          C  [impalad+0x18220ab]  impala::ImpalaServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string const&, int, void*)+0x12b
          
          kwho Michael Ho added a comment -

          It appears that some recent changes made the mem limit much lower, so the UDF failed to initialize. This exposed an existing bug in test-udf.cc; the fix should be straightforward.
          However, some other tests (e.g. test_udf_constant_folding) also failed due to the mem limit, and it's unclear whether that's intended.

          The MemLimit number in the exception message also contains a bogus value. cc'ing Tim Armstrong.

          =================================================== FAILURES ====================================================
           TestUdfExecution.test_udf_constant_folding[exec_option: {'disable_codegen': False, 'exec_single_node_rows_threshold': 0, 'enable_expr_rewrites': False} | table_format: text/none]
          tests/query_test/test_udfs.py:357: in test_udf_constant_folding
              result = self.execute_query_expect_success(self.client, query, exec_options)
          tests/common/impala_test_suite.py:475: in wrapper
              return function(*args, **kwargs)
          tests/common/impala_test_suite.py:482: in execute_query_expect_success
              result = cls.__execute_query(impalad_client, query, query_options)
          tests/common/impala_test_suite.py:567: in __execute_query
              return impalad_client.execute(query, user=user)
          tests/common/impala_connection.py:160: in execute
              return self.__beeswax_client.execute(sql_stmt, user=user)
          tests/beeswax/impala_beeswax.py:173: in execute
              handle = self.__execute_query(query_string.strip(), user=user)
          tests/beeswax/impala_beeswax.py:337: in __execute_query
              handle = self.execute_query_async(query_string, user=user)
          tests/beeswax/impala_beeswax.py:333: in execute_query_async
              return self.__do_rpc(lambda: self.imp_service.query(query,))
          tests/beeswax/impala_beeswax.py:458: in __do_rpc
              raise ImpalaBeeswaxException(self.__build_error_message(b), b)
          E   ImpalaBeeswaxException: ImpalaBeeswaxException:
          E    INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
          E    MESSAGE: ExecPlanRequest rpc query_id=ff4a95fc6a7800d8:846524da00000000 instance_id=ff4a95fc6a7800d8:846524da00000000 failed: Memory limit exceeded: Query ff4a95fc6a7800d8:846524da00000000 could not start because the backend Impala daemon is over its memory limit
          E   Error occurred on backend kwho-desktop:22000
          E   Memory left in process limit: -132692090900.00 B
          E   Process: memory limit exceeded. Limit=8.35 GB Total=131.93 GB Peak=136.23 GB
          E     RequestPool=default-pool: Total=131.93 GB Peak=136.23 GB
          E       Query(374368d02e93da17:13a99a1d00000000): memory limit exceeded. Limit=1.00 MB Total=131.93 GB Peak=131.93 GB
          E         Block Manager: Total=0 Peak=0
          E         Fragment 374368d02e93da17:13a99a1d00000003: Total=131.93 GB Peak=131.93 GB
          E           AGGREGATION_NODE (id=1): Total=131.93 GB Peak=131.93 GB
          E             Exprs: Total=131.93 GB Peak=131.93 GB
          E           HDFS_SCAN_NODE (id=0): Total=24.00 KB Peak=61.00 KB
          E           DataStreamSender (dst_id=2): Total=14.22 KB Peak=14.22 KB
          E       Query(ff4a95fc6a7800d8:846524da00000000): Total=0 Peak=0
          E     RequestPool=fe-eval-exprs: Total=0 Peak=20.00 MB
          
          kwho Michael Ho added a comment -

          The crash in MemTestClose() is due to the UDF's incorrect assumption that Prepare() succeeded.

          void MemTestClose(FunctionContext* context, FunctionContext::FunctionStateScope scope) {
            if (scope == FunctionContext::THREAD_LOCAL) {
              int64_t* total = reinterpret_cast<int64_t*>(
                  context->GetFunctionState(FunctionContext::THREAD_LOCAL)); <<---- total could be null if Expr::Prepare() failed internally.
              context->Free(*total);
              context->Free(reinterpret_cast<uint8_t*>(total));
              context->SetFunctionState(scope, NULL);
            }
          }
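
          A minimal sketch of the null-safe variant (an illustration only; the committed fix is in the Gerrit change linked later in this issue): return early when the thread-local state was never set because Prepare() did not run.

          void MemTestClose(FunctionContext* context, FunctionContext::FunctionStateScope scope) {
            if (scope == FunctionContext::THREAD_LOCAL) {
              int64_t* total = reinterpret_cast<int64_t*>(
                  context->GetFunctionState(FunctionContext::THREAD_LOCAL));
              // If Expr::Prepare() failed internally (e.g. under memory pressure), the
              // prepare function never allocated this state, so there is nothing to free.
              if (total == NULL) return;
              context->Free(*total);
              context->Free(reinterpret_cast<uint8_t*>(total));
              context->SetFunctionState(scope, NULL);
            }
          }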
          
          srus@cloudera.com Silvius Rus added a comment -

          Taras Bobrovytsky, can you set the appropriate target version?

          tarasbob Taras Bobrovytsky added a comment -

          Looks like Michael has a patch for this already https://gerrit.cloudera.org/#/c/6757/. Assigning it to him.

          kwho Michael Ho added a comment -

          Taras Bobrovytsky, the patch fixes the crash in MemTestClose(). Can you please continue investigating the TestUdfExecution.test_udf_constant_folding failures?

          dhecht Dan Hecht added a comment -

          Taras Bobrovytsky, Michael Ho, where are we on figuring this out?

          kwho Michael Ho added a comment - edited

          The following commit fixes the crash in MemTestClose() in IMPALA-5246.

          IMPALA-5246: MemTestClose() should handle Expr's preparation failure
          A UDF may fail to initialize due to exceeding the memory limit
          or for other reasons, in which case its Prepare() function may
          not have been called and its thread-local state may not be
          initialized.

          MemTestClose() in test-udf.cc made the wrong assumption that
          the thread-local state is always initialized, which may lead
          to dereferencing a null pointer in Close(). This change fixes
          the issue by returning early from Close() when the thread-local
          state is null. It also sets the fragment- or thread-local
          state in FunctionContext to nullptr after freeing it in
          various built-ins' Close() functions.

          Change-Id: Id2c689246ed4f8dd38f104fa35904f3926a7039c
          Reviewed-on: http://gerrit.cloudera.org:8080/6757
          Reviewed-by: Dan Hecht <dhecht@cloudera.com>
          Reviewed-by: Attila Jeges <attilaj@cloudera.com>
          Tested-by: Impala Public Jenkins
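
          The second part of that change (resetting the state to nullptr after freeing it) matters because Close() could otherwise leave a dangling pointer behind. A rough illustration of the pattern, using a hypothetical built-in that keeps a scratch buffer in fragment-local state (the function name is made up; the FunctionContext calls are the same ones used above):

          void SomeBuiltinClose(FunctionContext* context,
              FunctionContext::FunctionStateScope scope) {
            if (scope == FunctionContext::FRAGMENT_LOCAL) {
              uint8_t* scratch = reinterpret_cast<uint8_t*>(
                  context->GetFunctionState(FunctionContext::FRAGMENT_LOCAL));
              if (scratch != nullptr) context->Free(scratch);
              // Reset the state so a repeated Close() or a later reader sees nullptr
              // instead of a pointer to freed memory.
              context->SetFunctionState(scope, nullptr);
            }
          }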

          The latest ASAN build failed with the following message, which is similar to the TestUdfExecution.test_udf_constant_folding failure above and shows an implausibly large consumption value in the Exprs MemTracker.

          02:18:07 shell/util.py:95: in run_impala_shell_cmd
          02:18:07     result.stderr)
          02:18:07 E   AssertionError: Cmd -r -q "select 1" was expected to succeed: Starting Impala Shell without Kerberos authentication
          02:18:07 E   Connected to localhost:21000
          02:18:07 E   Server version: impalad version 2.9.0-SNAPSHOT DEBUG (build 25ba76287e36181c3fce81239763532f98fc9420)
          02:18:07 E   Invalidating Metadata
          02:18:07 E   Query: invalidate metadata
          02:18:07 E   Query submitted at: 2017-05-08 22:57:53 (Coordinator: http://impala-boost-static-burst-slave-1591.vpc.cloudera.com:25000)
          02:18:07 E   Query progress can be monitored at: http://impala-boost-static-burst-slave-1591.vpc.cloudera.com:25000/query_plan?query_id=1048cc8cf2b5782f:1c39c7e200000000
          02:18:07 E   Fetched 0 row(s) in 4.26s
          02:18:07 E   Query: select 1
          02:18:07 E   Query submitted at: 2017-05-08 22:57:57 (Coordinator: http://impala-boost-static-burst-slave-1591.vpc.cloudera.com:25000)
          02:18:07 E   ERROR: ExecPlanRequest rpc query_id=c846f88375486aaa:14043ffb00000000 instance_id=c846f88375486aaa:14043ffb00000000 failed: Memory limit exceeded: Query c846f88375486aaa:14043ffb00000000 could not start because the backend Impala daemon is over its memory limit
          02:18:07 E   Error occurred on backend impala-boost-static-burst-slave-1591.vpc.cloudera.com:22000
          02:18:07 E   Memory left in process limit: -89321034284.00 B
          02:18:07 E   Process: memory limit exceeded. Limit=17.89 GB Total=101.07 GB Peak=131.84 GB
          02:18:07 E     RequestPool=fe-eval-exprs: Total=0 Peak=20.00 MB
          02:18:07 E     RequestPool=default-pool: Total=101.07 GB Peak=131.84 GB
          02:18:07 E       Query(f441e77999e36533:2dd37c6600000000): memory limit exceeded. Limit=1.00 MB Total=101.07 GB Peak=101.07 GB
          02:18:07 E         Block Manager: Total=0 Peak=0
          02:18:07 E         Fragment f441e77999e36533:2dd37c6600000003: Total=101.07 GB Peak=101.07 GB
          02:18:07 E           AGGREGATION_NODE (id=1): Total=101.07 GB Peak=101.07 GB
          02:18:07 E             Exprs: Total=101.07 GB Peak=101.07 GB
          02:18:07 E           HDFS_SCAN_NODE (id=0): Total=24.00 KB Peak=61.00 KB
          02:18:07 E           DataStreamSender (dst_id=2): Total=14.22 KB Peak=14.22 KB
          02:18:07 E           CodeGen: Total=3.99 KB Peak=297.50 KB
          02:18:07 E       Query(c846f88375486aaa:14043ffb00000000): Total=0 Peak=0
          
          dknupp David Knupp added a comment -

          We are seeing this in different tests on ASAN builds, not just the UDF tests, so I changed the summary to be more specific.

          zamsden Zach Amsden added a comment -

          Saw two more failures: http://sandbox.jenkins.cloudera.com/job/impala-asf-master-core-asan/334/consoleFull
          dhecht Dan Hecht added a comment -

          Taras Bobrovytsky, can this be resolved now?

          tarasbob Taras Bobrovytsky added a comment -
          commit b73c445eba48a4637c5642fa1866e237617a322d
          Author: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
          Date:   Thu May 11 13:54:13 2017 -0700
          
              IMPALA-5246: Verify that a UDF test completely finishes before moving on
              
              A memory intensive UDF test takes a while to completely finish and for
              the memory in Impala to be completely freed. This caused a problem in
              ASAN builds (and potentially in normal builds) because we would start
              the next test right away, before the memory is freed.
              
              We fix the issue by checking that all fragments finish executing before
              starting the next test.
              
              Testing:
              - Ran a private ASAN build which passed.
          

            People

            • Assignee:
              tarasbob Taras Bobrovytsky
              Reporter:
              kwho Michael Ho
            • Votes:
              0
              Watchers:
              7

              Dates

              • Created:
                Updated:
                Resolved:

                Development