IMPALA-4388

OOM in test_insert where none was before

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend

      Description

      Between 838c1f54428eec357bc21d27b302e25e125928c6 and c24e9da914e1d5e5dabd1bded5a78452bccff9b5:

      09:18:06.229  TestInsertQueries.test_insert[compression_codec: snappy | exec_option: {'batch_size': 0, 'num_nodes': 0, 'sync_ddl': 1, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 
      09:18:06.229 query_test/test_insert.py:111: in test_insert
      09:18:06.229     multiple_impalad=vector.get_value('exec_option')['sync_ddl'] == 1)
      09:18:06.229 common/impala_test_suite.py:320: in run_test_case
      09:18:06.229     result = self.__execute_query(target_impalad_client, query, user=user)
      09:18:06.229 common/impala_test_suite.py:511: in __execute_query
      09:18:06.229     return impalad_client.execute(query, user=user)
      09:18:06.229 common/impala_connection.py:160: in execute
      09:18:06.229     return self.__beeswax_client.execute(sql_stmt, user=user)
      09:18:06.229 beeswax/impala_beeswax.py:173: in execute
      09:18:06.229     handle = self.__execute_query(query_string.strip(), user=user)
      09:18:06.229 beeswax/impala_beeswax.py:339: in __execute_query
      09:18:06.229     self.wait_for_completion(handle)
      09:18:06.229 beeswax/impala_beeswax.py:359: in wait_for_completion
      09:18:06.229     raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
      09:18:06.229 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
      09:18:06.229 E    Query aborted:
      09:18:06.229 E   Memory limit exceeded
      09:18:06.229 E   Query did not have enough memory to get the minimum required buffers in the block manager.
      09:18:06.229 E   
      09:18:06.229 E   
      09:18:06.229 E   
      09:18:06.229 E   Memory Limit Exceeded by fragment: ad4a399d1e83038f:9f9365da00000000
      09:18:06.229 E   Query(ad4a399d1e83038f:9f9365da00000000): Limit=64.00 MB Total=4.20 KB Peak=8.00 MB
      09:18:06.229 E     Fragment ad4a399d1e83038f:9f9365da00000000: Total=4.20 KB Peak=4.20 KB
      09:18:06.229 E       SORT_NODE (id=2): Total=0 Peak=0
      09:18:06.229 E       EXCHANGE_NODE (id=1): Total=0 Peak=0
      09:18:06.229 E       DataStreamRecvr: Total=4.20 KB Peak=4.20 KB
      09:18:06.229 E     Block Manager: Total=0 Peak=0
      09:18:06.229 E   Memory Limit Exceeded by fragment: ad4a399d1e83038f:9f9365da00000001
      09:18:06.229 E   Query(ad4a399d1e83038f:9f9365da00000000): Limit=64.00 MB Total=8.39 KB Peak=8.00 MB
      09:18:06.229 E     Fragment ad4a399d1e83038f:9f9365da00000001: Total=8.39 KB Peak=8.39 KB
      09:18:06.229 E       SORT_NODE (id=2): Total=0 Peak=0
      09:18:06.229 E       EXCHANGE_NODE (id=1): Total=0 Peak=0
      09:18:06.229 E       DataStreamRecvr: Total=8.39 KB Peak=8.39 KB
      09:18:06.229 E     Block Manager: Total=0 Peak=0
      09:18:06.229 E   Memory Limit Exceeded by fragment: ad4a399d1e83038f:9f9365da00000002
      09:18:06.229 E   Query(ad4a399d1e83038f:9f9365da00000000): Limit=64.00 MB Total=4.20 KB Peak=8.00 MB
      09:18:06.229 E     Fragment ad4a399d1e83038f:9f9365da00000002: Total=4.20 KB Peak=4.20 KB
      09:18:06.229 E       SORT_NODE (id=2): Total=0 Peak=0
      09:18:06.229 E       EXCHANGE_NODE (id=1): Total=0 Peak=0
      09:18:06.229 E       DataStreamRecvr: Total=4.20 KB Peak=4.20 KB
      09:18:06.229 E     Block Manager: Total=0 Peak=0
      

      Could this be ba026f27267204610cf238f24cd219fe297dbe96? This is testing in release mode with the exhaustive exploration strategy.

        Activity

        tarmstrong Tim Armstrong added a comment -

        This is probably a result of different concurrent queries competing for memory.

        lv Lars Volker added a comment -

        I suspect this is due to this change: https://gerrit.cloudera.org/#/c/4745/

        It added a new test at the end of insert.test, and it looks like the set mem_limit=64m; from the preceding test is not reverted automatically. The clustered hint adds a new sort node after the exchange, which is consistent with the memory limit exceeded error message.

        I will look into why set mem_limit=64m; is not reverted, and either fix that or add a reverting command after the offending test.

        lv Lars Volker added a comment -

        IMPALA-4388: Fix query option reset in tests

        Before this change, using sync_ddl=1 could prevent query options from
        being reset correctly. The test execution would use a connection to a
        random impalad and execute the test against it, but then undo all
        changes to the query options on the default connection (instead of the
        one used for the test).

        The fix is to undo the changes on the correct connection.

        Change-Id: I82e97438ee9f4f75907704653faa884722213f5d
        Reviewed-on: http://gerrit.cloudera.org:8080/4870
        Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
        Tested-by: Internal Jenkins
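
The failure mode described in the commit message can be illustrated with a small, self-contained sketch. This is not the actual Impala test framework code: `FakeConnection`, `run_test_case`, `leaked_options`, and the `choose` parameter are hypothetical names invented for illustration, and the connections are plain in-memory objects rather than real impalad clients. The sketch only assumes what the message states: with sync_ddl=1 the test runs on a randomly chosen connection, and before the fix the option changes were undone on the default connection instead.

```python
import random


class FakeConnection:
    """Hypothetical stand-in for a client connection to one impalad."""

    def __init__(self, name):
        self.name = name
        self.query_options = {}

    def set_option(self, key, value):
        self.query_options[key] = value

    def clear_option(self, key):
        self.query_options.pop(key, None)


def run_test_case(default_conn, all_conns, options, sync_ddl, buggy,
                  choose=random.choice):
    """Apply options, 'run' the test, then undo the option changes.

    With sync_ddl set, the test targets a connection picked by `choose`.
    The buggy variant undoes the changes on the default connection;
    the fixed variant undoes them on the connection that was used.
    """
    target = choose(all_conns) if sync_ddl else default_conn
    for key, value in options.items():
        target.set_option(key, value)
    # ... the test queries would execute on `target` here ...
    reset_conn = default_conn if buggy else target
    for key in options:
        reset_conn.clear_option(key)
    return target


def leaked_options(buggy):
    """Return the options still set on each connection after one test."""
    conns = [FakeConnection("impalad-%d" % i) for i in range(3)]
    # Pick a non-default connection deterministically for the demo.
    run_test_case(conns[0], conns, {"mem_limit": "64m"}, sync_ddl=True,
                  buggy=buggy, choose=lambda cs: cs[1])
    return {c.name: dict(c.query_options) for c in conns if c.query_options}


if __name__ == "__main__":
    print("before fix:", leaked_options(buggy=True))
    print("after fix: ", leaked_options(buggy=False))
```

In the buggy variant the 64m limit stays set on the non-default connection, so a later test that happens to run there inherits it, which matches the Limit=64.00 MB shown in the error output above.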


          People

          • Assignee:
            lv Lars Volker
            Reporter:
            jbapple Jim Apple