IMPALA-4999

Impala.tests.custom_cluster.test_spilling.TestSpillStress.test_spill_stress failed intermittently

    Details

      Description

      Stacktrace
      
      self = <test_spilling.TestSpillStress object at 0x7f7f680241d0>
      vector = <tests.common.test_vector.ImpalaTestVector object at 0x53250d0>
      
          @pytest.mark.stress
          def test_spill_stress(self, vector):
            # Number of times to execute each query
            for i in xrange(vector.get_value('iterations')):
      >       self.run_test_case('agg_stress', vector)
      
      custom_cluster/test_spilling.py:99: 
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      common/impala_test_suite.py:359: in run_test_case
          result = self.__execute_query(target_impalad_client, query, user=user)
      common/impala_test_suite.py:567: in __execute_query
          return impalad_client.execute(query, user=user)
      common/impala_connection.py:160: in execute
          return self.__beeswax_client.execute(sql_stmt, user=user)
      beeswax/impala_beeswax.py:173: in execute
          handle = self.__execute_query(query_string.strip(), user=user)
      beeswax/impala_beeswax.py:339: in __execute_query
          self.wait_for_completion(handle)
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      
      self = <tests.beeswax.impala_beeswax.ImpalaBeeswaxClient object at 0x3477190>
      query_handle = QueryHandle(log_context='21431d366cf751da:e62f867600000000', id='21431d366cf751da:e62f867600000000')
      
          def wait_for_completion(self, query_handle):
            """Given a query handle, polls the coordinator waiting for the query to complete"""
            while True:
              query_state = self.get_state(query_handle)
              # if the rpc succeeded, the output is the query state
              if query_state == self.query_states["FINISHED"]:
                break
              elif query_state == self.query_states["EXCEPTION"]:
                try:
                  error_log = self.__do_rpc(
                    lambda: self.imp_service.get_log(query_handle.log_context))
      >           raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
      E           ImpalaBeeswaxException: ImpalaBeeswaxException:
      E            Query aborted:
      E           Memory limit exceeded
      E           The memory limit is set too low to initialize spilling operator (id=3). The minimum required memory to spill this operator is 4.25 MB.
      E           
      E           
      E           
      E           Memory Limit Exceeded by fragment: 21431d366cf751da:e62f867600000004
      E           Query(21431d366cf751da:e62f867600000000): Total=260.67 MB Peak=303.73 MB
      E             Fragment 21431d366cf751da:e62f86760000000d: Total=27.86 MB Peak=33.97 MB
      E               AGGREGATION_NODE (id=6): Total=8.00 KB Peak=8.00 KB
      E                 Exprs: Total=4.00 KB Peak=4.00 KB
      E               AGGREGATION_NODE (id=11): Total=27.82 MB Peak=32.07 MB
      E               EXCHANGE_NODE (id=10): Total=0 Peak=0
      E               DataStreamRecvr: Total=7.52 KB Peak=2.62 MB
      E               DataStreamSender (dst_id=12): Total=16.00 KB Peak=16.00 KB
      E               CodeGen: Total=5.57 KB Peak=395.50 KB
      E             Block Manager: Limit=250.00 MB Total=250.00 MB Peak=250.00 MB
      E             Fragment 21431d366cf751da:e62f86760000000a: Total=224.32 MB Peak=228.25 MB
      E               Runtime Filter Bank: Total=1.00 MB Peak=1.00 MB
      E               AGGREGATION_NODE (id=5): Total=80.46 MB Peak=80.46 MB
      E               HASH_JOIN_NODE (id=4): Total=142.73 MB Peak=149.62 MB
      E                 Hash Join Builder (join_node_id=4): Total=142.64 MB Peak=149.58 MB
      E               EXCHANGE_NODE (id=8): Total=0 Peak=0
      E               DataStreamRecvr: Total=2.84 KB Peak=23.96 MB
      E               EXCHANGE_NODE (id=9): Total=0 Peak=0
      E               DataStreamRecvr: Total=0 Peak=130.42 KB
      E               DataStreamSender (dst_id=10): Total=90.57 KB Peak=202.57 KB
      E               CodeGen: Total=25.46 KB Peak=1.53 MB
      E             Fragment 21431d366cf751da:e62f867600000004: Total=8.49 MB Peak=193.94 MB
      E               Runtime Filter Bank: Total=4.00 MB Peak=4.00 MB
      E               HASH_JOIN_NODE (id=3): Total=4.36 MB Peak=168.69 MB
      E                 Hash Join Builder (join_node_id=3): Total=4.30 MB Peak=168.55 MB
      E               HDFS_SCAN_NODE (id=2): Total=0 Peak=11.95 MB
      E               EXCHANGE_NODE (id=7): Total=0 Peak=0
      E               DataStreamRecvr: Total=0 Peak=22.20 MB
      E               DataStreamSender (dst_id=8): Total=76.53 KB Peak=188.53 KB
      E               CodeGen: Total=39.78 KB Peak=1.57 MB
      
      beeswax/impala_beeswax.py:359: ImpalaBeeswaxException
      
      -- executing against localhost:21000
      use tpch;
      
      SET disable_codegen=False;
      SET abort_on_error=1;
      SET exec_single_node_rows_threshold=0;
      SET batch_size=0;
      SET num_nodes=0;
      -- executing against localhost:21000
      set max_block_mgr_memory=250m;
      
      -- executing against localhost:21000
      
      select
        count(distinct l2.l_comment)
      from
        lineitem l1,
        lineitem l2,
        lineitem l3
      where
        l1.l_tax < 1.0
        and l2.l_tax < 1.0
        and l1.l_orderkey = l2.l_orderkey
        and l1.l_orderkey = l3.l_orderkey
        and l1.l_comment = l3.l_comment
        and l1.l_shipdate = l3.l_shipdate;
      
      -- executing against localhost:21000
      SET MAX_BLOCK_MGR_MEMORY=0;
      
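      The Block Manager line in the dump above shows its 250.00 MB limit fully
      consumed (Total=250.00 MB Peak=250.00 MB), which is apparently why spilling
      operator id=3 could not reserve its 4.25 MB minimum: set
      max_block_mgr_memory=250m leaves no headroom once the other operators take
      their buffers. For local triage, a minimal repro sketch (assumptions: a
      running impalad at localhost:21000, a loaded tpch database, and the test
      repo's ImpalaBeeswaxClient exposing connect() and execute() as used in the
      stacktrace above) that replays the failing statement at a few
      block-manager limits:

      # Hedged repro sketch: replay the failing TPC-H self-join at increasing
      # max_block_mgr_memory settings to see where "Memory limit exceeded"
      # stops. Assumes ImpalaBeeswaxClient from tests/beeswax/impala_beeswax.py
      # behaves as shown in the stacktrace above.
      from tests.beeswax.impala_beeswax import (ImpalaBeeswaxClient,
                                                ImpalaBeeswaxException)

      QUERY = """
      select count(distinct l2.l_comment)
      from lineitem l1, lineitem l2, lineitem l3
      where l1.l_tax < 1.0 and l2.l_tax < 1.0
        and l1.l_orderkey = l2.l_orderkey
        and l1.l_orderkey = l3.l_orderkey
        and l1.l_comment = l3.l_comment
        and l1.l_shipdate = l3.l_shipdate"""

      client = ImpalaBeeswaxClient('localhost:21000')  # assumed host:port arg
      client.connect()
      client.execute('use tpch')
      for limit in ('250m', '300m', '500m'):
        client.execute('set max_block_mgr_memory=%s' % limit)
        try:
          client.execute(QUERY)
          print('%s: query finished' % limit)
        except ImpalaBeeswaxException as e:
          print('%s: query aborted: %s' % (limit, e))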

        Activity

        tarmstrong Tim Armstrong added a comment -

        Maybe we should just delete the test. I'm not convinced it provides valuable coverage.

        mikesbrown Michael Brown added a comment -

        I saw this fail in a slightly different way:

         TestSpillStress.test_spill_stress[iterations: 1 | test_id: 0 | exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none] 
        
        self = <test_spilling.TestSpillStress object at 0x59695d0>
        vector = <tests.common.test_vector.ImpalaTestVector object at 0x6661fd0>
        
            @pytest.mark.stress
            def test_spill_stress(self, vector):
              # Number of times to execute each query
              for i in xrange(vector.get_value('iterations')):
        >       self.run_test_case('agg_stress', vector)
        
        custom_cluster/test_spilling.py:99: 
        _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
        common/impala_test_suite.py:359: in run_test_case
            result = self.__execute_query(target_impalad_client, query, user=user)
        common/impala_test_suite.py:567: in __execute_query
            return impalad_client.execute(query, user=user)
        common/impala_connection.py:160: in execute
            return self.__beeswax_client.execute(sql_stmt, user=user)
        beeswax/impala_beeswax.py:173: in execute
            handle = self.__execute_query(query_string.strip(), user=user)
        beeswax/impala_beeswax.py:339: in __execute_query
            self.wait_for_completion(handle)
        _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
        
        self = <tests.beeswax.impala_beeswax.ImpalaBeeswaxClient object at 0x59503d0>
        query_handle = QueryHandle(log_context='a643ef679251bc0d:92b3014000000000', id='a643ef679251bc0d:92b3014000000000')
        
            def wait_for_completion(self, query_handle):
              """Given a query handle, polls the coordinator waiting for the query to complete"""
              while True:
                query_state = self.get_state(query_handle)
                # if the rpc succeeded, the output is the query state
                if query_state == self.query_states["FINISHED"]:
                  break
                elif query_state == self.query_states["EXCEPTION"]:
                  try:
                    error_log = self.__do_rpc(
                      lambda: self.imp_service.get_log(query_handle.log_context))
        >           raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
        E           ImpalaBeeswaxException: ImpalaBeeswaxException:
        E            Query aborted:
        E           Cannot perform hash join at node with id 3. Repartitioning did not reduce the size of a spilled partition. Repartitioning level 6. Number of rows 1.
        E           
        E           
        E           
        E           Cannot perform hash join at node with id 3. Repartitioning did not reduce the size of a spilled partition. Repartitioning level 6. Number of rows 1.
        
        mikesbrown Michael Brown added a comment -

        Tim Armstrong: In light of this slightly different failure and your comment above, please decide whether we should keep this test and fix it, or just remove it.

        tarmstrong Tim Armstrong added a comment -

        IMPALA-4914,IMPALA-4999: remove flaky TestSpillStress

        The test does not work as intended and would likely not provide
        very good coverage of the different spilling paths, because it
        only runs a single simple query. The stress test
        (tests/stress/concurrent_select.py) provides much better coverage.
        test_mem_usage_scaling.py probably also provides better coverage
        than TestSpillStress in its current form.

        Change-Id: Ie792d64dc88f682066c13e559918d72d76b31b71
        Reviewed-on: http://gerrit.cloudera.org:8080/6437
        Reviewed-by: Michael Brown <mikeb@cloudera.com>
        Tested-by: Impala Public Jenkins


          People

          • Assignee: tarmstrong Tim Armstrong
          • Reporter: kwho Michael Ho
          • Votes: 0
          • Watchers: 4
