Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-1621

Reduce minimum memory requirement for TPC-H Q9 (1.29GB)

    XMLWordPrintableJSON

    Details

      Description

      In order to run TPC-H Q9, even in the small dataset, we need to set mem_limit at least 1.29GB. At 1.28GB it fails with 'Memory Limit Exceeded' error. That is too much.

      [localhost:21000] > set mem_limit=1280m;
      MEM_LIMIT set to 1280m
      [localhost:21000] > select              
      nation,
      o_year,
      sum(amount) as sum_profit
      from(
      select
      n_name as nation,
      year(o_orderdate) as o_year,
      l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
      from
      part,
      supplier,
      lineitem,
      partsupp,
      orders,
      nation
      where
      s_suppkey = l_suppkey
      and ps_suppkey = l_suppkey
      and ps_partkey = l_partkey
      and p_partkey = l_partkey
      and o_orderkey = l_orderkey
      and s_nationkey = n_nationkey
      and p_name like '%green%'
      ) as profit
      group by
      nation,
      o_year
      order by
      nation,
      o_year desc;
      WARNINGS: Memory limit exceeded
      Query did not have enough memory to get the minimum required buffers in the block manager.
      
      Backend 3:Memory Limit Exceeded
      Query(ea48ecbd6321c97b:48c769bc42a1c18c) Limit: Limit=1.25 GB Consumption=1.13 GB
        Fragment ea48ecbd6321c97b:48c769bc42a1c18e: Consumption=6.26 MB
          SORT_NODE (id=12): Consumption=0
          AGGREGATION_NODE (id=19): Consumption=6.25 MB
          EXCHANGE_NODE (id=18): Consumption=0
          DataStreamRecvr: Consumption=0
          DataStreamSender: Consumption=4.00 KB
        Block Manager: Limit=1024.00 MB Consumption=1023.50 MB
        Fragment ea48ecbd6321c97b:48c769bc42a1c191: Consumption=994.57 MB
          AGGREGATION_NODE (id=11): Consumption=6.25 MB
          HASH_JOIN_NODE (id=10): Consumption=393.04 MB
          HASH_JOIN_NODE (id=9): Consumption=2.03 MB
          HASH_JOIN_NODE (id=8): Consumption=524.02 MB
          HASH_JOIN_NODE (id=7): Consumption=5.46 MB
          HASH_JOIN_NODE (id=6): Consumption=4.32 MB
          HDFS_SCAN_NODE (id=2): Consumption=32.86 MB
          EXCHANGE_NODE (id=13): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=14): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=15): Consumption=0
          DataStreamRecvr: Consumption=26.57 MB
          EXCHANGE_NODE (id=16): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=17): Consumption=0
          DataStreamRecvr: Consumption=0
          DataStreamSender: Consumption=12.00 KB
      Backend 4:Memory Limit Exceeded
      Query(ea48ecbd6321c97b:48c769bc42a1c18c) Limit: Limit=1.25 GB Consumption=1.10 GB
        Fragment ea48ecbd6321c97b:48c769bc42a1c18f: Consumption=6.26 MB
          SORT_NODE (id=12): Consumption=0
          AGGREGATION_NODE (id=19): Consumption=6.25 MB
          EXCHANGE_NODE (id=18): Consumption=0
          DataStreamRecvr: Consumption=0
          DataStreamSender: Consumption=4.00 KB
        Block Manager: Limit=1024.00 MB Consumption=1017.00 MB
        Fragment ea48ecbd6321c97b:48c769bc42a1c192: Consumption=1.10 GB
          AGGREGATION_NODE (id=11): Consumption=6.25 MB
          HASH_JOIN_NODE (id=10): Consumption=524.04 MB
          HASH_JOIN_NODE (id=9): Consumption=2.03 MB
          HASH_JOIN_NODE (id=8): Consumption=532.02 MB
          HASH_JOIN_NODE (id=7): Consumption=5.53 MB
          HASH_JOIN_NODE (id=6): Consumption=4.32 MB
          HDFS_SCAN_NODE (id=2): Consumption=38.86 MB
          EXCHANGE_NODE (id=13): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=14): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=15): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=16): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=17): Consumption=0
          DataStreamRecvr: Consumption=8.41 MB
          DataStreamSender: Consumption=12.00 KB
      Backend 5:Memory Limit Exceeded
      Query(ea48ecbd6321c97b:48c769bc42a1c18c) Limit: Limit=1.25 GB Consumption=1.11 GB
        Fragment ea48ecbd6321c97b:48c769bc42a1c18d: Consumption=8.00 KB
          EXCHANGE_NODE (id=20): Consumption=0
          DataStreamRecvr: Consumption=0
        Block Manager: Limit=1024.00 MB Consumption=1017.00 MB
        Fragment ea48ecbd6321c97b:48c769bc42a1c190: Consumption=6.26 MB
          SORT_NODE (id=12): Consumption=0
          AGGREGATION_NODE (id=19): Consumption=6.25 MB
          EXCHANGE_NODE (id=18): Consumption=0
          DataStreamRecvr: Consumption=0
          DataStreamSender: Consumption=4.00 KB
        Fragment ea48ecbd6321c97b:48c769bc42a1c193: Consumption=1.10 GB
          AGGREGATION_NODE (id=11): Consumption=6.25 MB
          HASH_JOIN_NODE (id=10): Consumption=524.04 MB
          HASH_JOIN_NODE (id=9): Consumption=2.03 MB
          HASH_JOIN_NODE (id=8): Consumption=532.02 MB
          HASH_JOIN_NODE (id=7): Consumption=5.53 MB
          HASH_JOIN_NODE (id=6): Consumption=4.32 MB
          HDFS_SCAN_NODE (id=2): Consumption=38.86 MB
          EXCHANGE_NODE (id=13): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=14): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=15): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=16): Consumption=0
          DataStreamRecvr: Consumption=0
          EXCHANGE_NODE (id=17): Consumption=0
          DataStreamRecvr: Consumption=8.41 MB
          DataStreamSender: Consumption=12.00 KB
        Fragment ea48ecbd6321c97b:48c769bc42a1c197: Consumption=12.62 MB
          HDFS_SCAN_NODE (id=4): Consumption=12.56 MB
          DataStreamSender: Consumption=12.00 KB
      
      [localhost:21000] > set mem_limit=1290m;
      MEM_LIMIT set to 1290m
      [localhost:21000] > select  
      ...
      Fetched 175 row(s) in 6.29s
      

      Below is the summary of a successful run, when mem_limit is set to 1.29GB. Two inner joins (10, 08) consume around 550MB each.

      [localhost:21000] > summary;
      +---------------------+--------+----------+----------+---------+------------+-----------+---------------+-----------------------+
      | Operator            | #Hosts | Avg Time | Max Time | #Rows   | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail                |
      +---------------------+--------+----------+----------+---------+------------+-----------+---------------+-----------------------+
      | 20:MERGING-EXCHANGE | 1      | 266.11us | 266.11us | 175     | 52.27K     | 0 B       | -1 B          | UNPARTITIONED         |
      | 12:SORT             | 3      | 389.89us | 459.90us | 175     | 52.27K     | 24.02 MB  | 16.00 MB      |                       |
      | 19:AGGREGATE        | 3      | 116.90ms | 117.06ms | 175     | 52.27K     | 6.27 MB   | 10.00 MB      | FINALIZE              |
      | 18:EXCHANGE         | 3      | 69.31us  | 71.95us  | 525     | 52.27K     | 0 B       | 0 B           | HASH(nation,o_year)   |
      | 11:AGGREGATE        | 3      | 319.73ms | 339.09ms | 525     | 52.27K     | 15.76 MB  | 10.00 MB      |                       |
      | 10:HASH JOIN        | 3      | 1.20s    | 1.20s    | 319.40K | 17.60M     | 561.87 MB | 20.14 MB      | INNER JOIN, BROADCAST |
      | |--17:EXCHANGE      | 3      | 98.45ms  | 111.44ms | 2.40M   | 800.00K    | 0 B       | 0 B           | BROADCAST             |
      | |  03:SCAN HDFS     | 1      | 104.51ms | 104.51ms | 800.00K | 800.00K    | 10.81 MB  | 168.00 MB     | tpch_parquet.partsupp |
      | 09:HASH JOIN        | 3      | 27.24ms  | 30.16ms  | 319.40K | 6.00M      | 13.77 MB  | 690 B         | INNER JOIN, BROADCAST |
      | |--16:EXCHANGE      | 3      | 16.15us  | 17.77us  | 75      | 25         | 0 B       | 0 B           | BROADCAST             |
      | |  05:SCAN HDFS     | 1      | 19.81ms  | 19.81ms  | 25      | 25         | 50.00 KB  | 32.00 MB      | tpch_parquet.nation   |
      | 08:HASH JOIN        | 3      | 2.62s    | 2.95s    | 319.40K | 6.00M      | 548.03 MB | 53.50 MB      | INNER JOIN, BROADCAST |
      | |--15:EXCHANGE      | 3      | 293.75ms | 306.81ms | 4.50M   | 1.50M      | 0 B       | 0 B           | BROADCAST             |
      | |  04:SCAN HDFS     | 2      | 224.56ms | 355.46ms | 1.50M   | 1.50M      | 12.60 MB  | 80.00 MB      | tpch_parquet.orders   |
      | 07:HASH JOIN        | 3      | 58.47ms  | 59.47ms  | 319.40K | 6.00M      | 8.33 MB   | 107.42 KB     | INNER JOIN, BROADCAST |
      | |--14:EXCHANGE      | 3      | 1.34ms   | 1.62ms   | 30.00K  | 10.00K     | 0 B       | 0 B           | BROADCAST             |
      | |  01:SCAN HDFS     | 1      | 14.65ms  | 14.65ms  | 10.00K  | 10.00K     | 401.63 KB | 32.00 MB      | tpch_parquet.supplier |
      | 06:HASH JOIN        | 3      | 251.86ms | 254.41ms | 319.40K | 6.00M      | 6.70 MB   | 1.19 MB       | INNER JOIN, BROADCAST |
      | |--13:EXCHANGE      | 3      | 2.63ms   | 4.68ms   | 31.99K  | 20.00K     | 0 B       | 0 B           | BROADCAST             |
      | |  00:SCAN HDFS     | 1      | 302.87ms | 302.87ms | 10.66K  | 20.00K     | 7.75 MB   | 32.00 MB      | tpch_parquet.part     |
      | 02:SCAN HDFS        | 3      | 1.34s    | 1.54s    | 5.59M   | 6.00M      | 38.86 MB  | 480.00 MB     | tpch_parquet.lineitem |
      +---------------------+--------+----------+----------+---------+------------+-----------+---------------+-----------------------+
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                ippokratis Ippokratis Pandis
                Reporter:
                ippokratis Ippokratis Pandis
              • Votes:
                0 Vote for this issue
                Watchers:
                7 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: