IMPALA-6701

stress test compute stats binary search can't find a start point


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, Impala 2.12.0
    • Fix Version/s: None
    • Component/s: Infrastructure
    • Labels: None

    Description

      The stress test's binary search for compute stats statements recently took 9 hours.

      The stress test cannot find a starting point for mem_limit for compute stats statements, because Impala does not support running explain on them:

      [localhost:21000] > explain compute stats tpch.lineitem;
      Query: explain compute stats tpch.lineitem
      ERROR: AnalysisException: Syntax error in line 1:
      explain compute stats tpch.lineitem
              ^
      Encountered: COMPUTE
      Expected: CREATE, DELETE, INSERT, SELECT, UPDATE, UPSERT, VALUES, WITH
      
      CAUSED BY: Exception: Syntax error
      
      [localhost:21000] >
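
      The parser error above lists the keywords that may follow explain. A minimal sketch of a pre-check built from that list is shown here. The names EXPLAINABLE_KEYWORDS and can_explain are hypothetical and not part of the stress test, and the keyword set is copied from this one error message, so other Impala versions may accept a different set.

```python
# Keywords taken verbatim from the "Expected:" line of the error above.
# Assumption: this set is only valid for the Impala build that produced it.
EXPLAINABLE_KEYWORDS = frozenset([
    "CREATE", "DELETE", "INSERT", "SELECT",
    "UPDATE", "UPSERT", "VALUES", "WITH",
])


def can_explain(statement):
    """Return True if the statement's first keyword is one the parser
    accepts after EXPLAIN. Leading comments or parentheses are not handled."""
    tokens = statement.split(None, 1)
    return bool(tokens) and tokens[0].upper() in EXPLAINABLE_KEYWORDS
```

      With this check, "compute stats tpch.lineitem" is rejected up front, while a plain select is accepted.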
      

      The stress test has skipped the memory estimate for compute stats statements ever since it began supporting them:

      1370 def estimate_query_mem_mb_usage(query, query_runner):
      1371   """Runs an explain plan then extracts and returns the estimated memory needed to run
      1372   the query.
      1373   """
      1374   with query_runner.impalad_conn.cursor() as cursor:
      1375     LOG.debug("Using %s database", query.db_name)
      1376     if query.db_name:
      1377       cursor.execute('USE ' + query.db_name)
      1378     if query.query_type == QueryType.COMPUTE_STATS:
      1379       # Running "explain" on compute stats is not supported by Impala.
      1380       return
      

      This means the stress test starts the binary search at impalad's full memory limit (77308 MB in the log below, halved to 38654 MB on the next run):

      2018-03-17 08:00:38,684 12313 MainThread INFO:concurrent_select[1164]:Collecting runtime info for query compute_stats_call_center_mt_dop_1: 
      COMPUTE STATS call_center
      2018-03-17 08:00:38,925 12313 MainThread DEBUG:concurrent_select[1375]:Using tpcds_300_decimal_parquet database
      2018-03-17 08:00:38,925 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE tpcds_300_decimal_parquet
      2018-03-17 08:00:39,007 12313 MainThread INFO:hiveserver2[265]:Closing active operation
      2018-03-17 08:00:39,123 12313 MainThread INFO:concurrent_select[1247]:Finding a starting point for binary search
      2018-03-17 08:00:39,148 12313 MainThread DEBUG:concurrent_select[866]:Using tpcds_300_decimal_parquet database
      2018-03-17 08:00:39,148 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE tpcds_300_decimal_parquet
      2018-03-17 08:00:39,206 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MT_DOP=1
      2018-03-17 08:00:39,333 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET ABORT_ON_ERROR=1
      2018-03-17 08:00:39,416 12313 MainThread DEBUG:concurrent_select[878]:Setting mem limit to 77308 MB
      2018-03-17 08:00:39,416 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MEM_LIMIT=77308M
      2018-03-17 08:00:39,503 12313 MainThread DEBUG:concurrent_select[882]:Running query with 77308 MB mem limit at vc0718.halxg.cloudera.com with timeout secs 9223372036854775807:
      COMPUTE STATS call_center
      2018-03-17 08:00:39,741 12313 MainThread DEBUG:concurrent_select[890]:Query id is 3b4213033bf2359c:d44b29c500000000
      2018-03-17 08:00:41,084 12313 MainThread INFO:hiveserver2[265]:Closing active operation
      2018-03-17 08:00:41,202 12313 MainThread DEBUG:concurrent_select[1209]:Spilled: False
      2018-03-17 08:00:41,202 12313 MainThread INFO:concurrent_select[1267]:Finding minimum memory required to avoid spilling
      2018-03-17 08:00:41,227 12313 MainThread DEBUG:concurrent_select[866]:Using tpcds_300_decimal_parquet database
      2018-03-17 08:00:41,227 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE tpcds_300_decimal_parquet
      2018-03-17 08:00:41,286 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MT_DOP=1
      2018-03-17 08:00:41,367 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET ABORT_ON_ERROR=1
      2018-03-17 08:00:41,449 12313 MainThread DEBUG:concurrent_select[878]:Setting mem limit to 38654 MB
      2018-03-17 08:00:41,449 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MEM_LIMIT=38654M
      2018-03-17 08:00:41,530 12313 MainThread DEBUG:concurrent_select[882]:Running query with 38654 MB mem limit at vc0718.halxg.cloudera.com with timeout secs 9223372036854775807:
      COMPUTE STATS call_center
      2018-03-17 08:00:41,589 12313 MainThread DEBUG:concurrent_select[890]:Query id is 74db40c3f221cf3:d67997c00000000
      2018-03-17 08:00:42,184 12313 MainThread INFO:hiveserver2[265]:Closing active operation
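
      To illustrate why the starting point matters, here is a simplified bisection sketch. It is not the actual concurrent_select logic: the function name probe_limits, the tolerance_mb parameter, and the required_mb value of 100 are all assumptions made for illustration. It only reproduces the halving pattern visible in the log (77308 MB, then 38654 MB).

```python
def probe_limits(start_mb, required_mb, tolerance_mb=1):
    """Return the sequence of mem limits a plain bisection would try.

    start_mb:     first limit tried (the upper bound of the search).
    required_mb:  hypothetical true minimum the query needs.
    tolerance_mb: stop once the search interval is this narrow.
    """
    lower, upper = 0, start_mb
    probes = [start_mb]  # the first run uses the starting limit itself
    while upper - lower > tolerance_mb:
        mid = (lower + upper) // 2
        probes.append(mid)  # each probe is one full run of the query
        if mid >= required_mb:
            upper = mid  # query fit, so try a lower limit
        else:
            lower = mid  # query did not fit, so raise the floor
    return probes


# Starting from the full impalad limit seen in the log.
from_full_limit = probe_limits(77308, required_mb=100)
# Starting from a hypothetical estimate much closer to the real need.
from_estimate = probe_limits(256, required_mb=100)
```

      Every probe is a complete compute stats run, so a starting point closer to the real requirement directly reduces the number of runs.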
      

      This has always been the case, but no one really looked into it until now.

      It's important to get this fixed soon as we increase the number of places where our stress tests run. Before, this was a very infrequent cost, but at least in my downstream environment, that is rapidly changing.

          People

            Assignee: Unassigned
            Reporter: Michael Brown (mikeb)