IMPALA-2470

TPC-H Q13 performance regression for Impala 2.3 Vs. 2.2

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.3.0
    • Fix Version/s: Impala 2.3.0
    • None

    Description

      I compared the performance of Impala 2.3 against 2.2 on TPC-H 300GB Parquet, and several queries show sizable regressions.
      Both versions were run against the same database. I will look into the query profiles next, but I am not sure they will tell us much, as we are likely to need hot-function information.

      Query  impalad version 2.2.0-cdh5  impalad version 2.3.0-cdh5 (linear)  Regression %
      1      176                         224                                  -26.82%
      2      34                          34                                   0.66%
      3      72                          75                                   -3.55%
      4      78                          76                                   2.85%
      5      81                          76                                   6.21%
      6      5                           6                                    -16.10%
      7      124                         115                                  7.44%
      8      102                         100                                  1.44%
      9      322                         313                                  2.74%
      10     29                          30                                   -2.70%
      11     12                          13                                   -6.05%
      12     18                          23                                   -32.45%
      13     77                          144                                  -86.09%
      14     15                          16                                   -4.35%
      15     15                          15                                   1.68%
      16     20                          19                                   2.70%
      17     257                         230                                  11.74%
      18     206                         166                                  24.33%
      19     375                         412                                  -9.95%
      20     50                          41                                   19.38%
      21     212                         202                                  4.62%
      22     19                          19                                   -1.58%

      The 10-node stress cluster also shows that the old code is faster (~60s vs. ~80s). Profiles are attached. From a quick look, the problem seems to be aggregation node 09.

      2.2:

            AGGREGATION_NODE (id=9):(Total: 1m1s, non-child: 1s141ms, % non-child: 1.85%)
               - BuildTime: 760.416ms
               - GetNewBlockTime: 118.79us
               - GetResultsTime: 260.596ms
               - HashBuckets: 4.19M (4194304)
               - LargestPartitionPercent: 6 (6)
               - MaxPartitionLevel: 1 (1)
               - NumRepartitions: 1 (1)
               - PartitionsCreated: 32 (32)
               - PeakMemoryUsage: 416.99 MB (437243724)
               - PinTime: 2.66us
               - RowsRepartitioned: 281.25K (281250)
               - RowsReturned: 4.50M (4500000)
               - RowsReturnedRate: 72.77 K/sec
               - SpilledPartitions: 1 (1)
               - UnpinTime: 14.163us
      

      2.3:

            AGGREGATION_NODE (id=9):(Total: 1m18s, non-child: 37s147ms, % non-child: 47.23%)
               - BuildTime: 36s921ms
               - GetNewBlockTime: 184.422us
               - GetResultsTime: 211.389ms
               - HTResizeTime: 187.135ms
               - HashBuckets: 8.39M (8388608)
               - LargestPartitionPercent: 6 (6)
               - MaxPartitionLevel: 0 (0)
               - NumRepartitions: 0 (0)
               - PartitionsCreated: 16 (16)
               - PeakMemoryUsage: 271.03 MB (284198476)
               - PinTime: 0ns
               - RowsRepartitioned: 0 (0)
               - RowsReturned: 4.50M (4500000)
               - RowsReturnedRate: 57.22 K/sec
               - SpilledPartitions: 0 (0)
               - UnpinTime: 978ns
      

      With linear probing, the final aggregation has extremely long probe sequences, which causes the slowdown compared to quadratic probing or Impala 2.2.

      It appears that the combination of data distribution and hash function produces longer-than-usual probe sequences under linear probing.

      I don't believe that NULLs play a role here, as c_custkey is a non-nullable column.
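
      To illustrate the clustering effect described above, here is a minimal sketch of an open-addressing table that counts probe travel under linear and quadratic probing. It is not Impala's hash table: `average_travel`, `poor_hash`, the key range, and the table size are all hypothetical, and the hash function is deliberately weak so that consecutive keys collide. The quadratic variant advances by 1, 2, 3, ... buckets, which is one common scheme and is assumed here for illustration only.

```python
"""Toy comparison of linear vs. quadratic probing in an open-addressing table.

Not Impala code: the table, the hash function, and the key set are
hypothetical and chosen only to make clustering visible.
"""


def average_travel(keys, num_buckets, hash_fn, quadratic):
    """Insert distinct keys; return the mean number of extra buckets visited per insert."""
    # A power-of-two size lets us wrap with a bit mask and guarantees that
    # triangular-number probing eventually visits every bucket.
    assert num_buckets & (num_buckets - 1) == 0, "num_buckets must be a power of two"
    mask = num_buckets - 1
    table = [None] * num_buckets
    total_travel = 0
    for key in keys:
        idx = hash_fn(key) & mask
        step = 0
        while table[idx] is not None:
            step += 1
            # Linear: always advance by 1. Quadratic: advance by 1, 2, 3, ...
            # so offsets from the home bucket are the triangular numbers.
            idx = (idx + (step if quadratic else 1)) & mask
        table[idx] = key
        total_travel += step
    return total_travel / len(keys)


def poor_hash(key):
    """Deliberately weak hash (an assumption): 8 consecutive keys share a home bucket."""
    return key // 8


if __name__ == "__main__":
    keys = range(2048)   # dense integer keys, loosely like a customer key column
    num_buckets = 4096   # load factor 0.5, close to the ~0.54 seen in the logs
    for name, quadratic in (("linear", False), ("quadratic", True)):
        travel = average_travel(keys, num_buckets, poor_hash, quadratic)
        print(f"{name:9s} travel per insert: {travel:.2f}")
```

      In this toy setup, linear probing builds one contiguous run of occupied buckets that every new key has to walk through, while the growing step of quadratic probing jumps out of the run quickly. The real workload differs in hash function and key distribution, so only the qualitative effect carries over.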

      Logs from linear probing

      I1001 22:07:38.215006 12550 hash-table.cc:264] Buckets: 524288 282011 0.537893
      Duplicates: 0 buckets 0 nodes
      Probes: 564022
      FailedProbes: 0
      Travel: 654607897 1160.61
      HashCollisions: 0 0
      Resizes: 9
      
      I1001 22:07:38.242862 12550 hash-table.cc:264] Buckets: 524288 281865 0.537615
      Duplicates: 0 buckets 0 nodes
      Probes: 563730
      FailedProbes: 0
      Travel: 635747352 1127.75
      HashCollisions: 0 0
      Resizes: 9
      

      For quadratic probing

      I1001 15:34:46.114688 11810 hash-table.cc:264] Buckets: 524288 281954 0.537785
      Duplicates: 0 buckets 0 nodes
      Probes: 563908
      FailedProbes: 0
      Travel: 11369004 20.1611
      HashCollisions: 0 0
      Resizes: 9
      
      I1001 15:34:46.140525 11810 hash-table.cc:264] Buckets: 524288 280655 0.535307
      Duplicates: 0 buckets 0 nodes
      Probes: 561310
      FailedProbes: 0
      Travel: 10722230 19.1022
      HashCollisions: 0 0
      Resizes: 9
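
      The derived figures in these log entries can be cross-checked with a little arithmetic. The sketch below assumes that the second number on the Buckets line is the count of filled buckets and that the second number on the Travel line is travel divided by probes; both readings are inferred from the ratios, not from the logging code. `derived_stats` is a hypothetical helper.

```python
def derived_stats(buckets, filled, probes, travel):
    """Return (load factor, travel per probe) from raw hash-table counters."""
    return filled / buckets, travel / probes


# First log entry of each run, copied from the output above.
linear = derived_stats(buckets=524288, filled=282011, probes=564022, travel=654607897)
quadratic = derived_stats(buckets=524288, filled=281954, probes=563908, travel=11369004)

if __name__ == "__main__":
    for name, (load, per_probe) in (("linear", linear), ("quadratic", quadratic)):
        print(f"{name:9s} load factor {load:.6f}, travel per probe {per_probe:.2f}")
    # Same table size and nearly the same load factor, very different travel.
    print(f"linear / quadratic travel per probe: {linear[1] / quadratic[1]:.1f}x")
```

      Under that reading, both runs have the same bucket count and almost the same load factor, yet travel per probe is more than 50 times higher with linear probing.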
      

      Simplified query used for the repro

      select count(*) from (
        select
          c_custkey
        from
          customer left outer join orders on (
            c_custkey = o_custkey
            and o_comment not like '%special%requests%')
        group by c_custkey) a
      
      Operator        Hosts  Avg Time  Max Time  Rows     Est.Rows  Peak Mem   Est. Peak Mem  Detail
      10:AGGREGATE    1      252.97ms  252.97ms  1        1         48.00 KB   -1 B           FINALIZE
      09:EXCHANGE     1      385.33us  385.33us  10       1         0 B        -1 B           UNPARTITIONED
      04:AGGREGATE    10     258.81ms  262.17ms  10       1         12.00 KB   10.00 MB
      08:AGGREGATE    10     36.87s    37.96s    45.00M   53.09M    271.02 MB  44.55 MB       FINALIZE
      07:EXCHANGE     10     203.05ms  303.85ms  45.00M   53.09M    0 B        0 B            HASH(c_custkey)
      03:AGGREGATE    10     13.11s    13.78s    45.00M   53.09M    271.02 MB  445.52 MB
      02:HASH JOIN    10     19.70s    21.34s    460.15M  405.00M   267.02 MB  37.77 MB       RIGHT OUTER JOIN, PARTITIONED
      --06:EXCHANGE   10     216.79ms  256.41ms  45.00M   45.00M    0 B        0 B            HASH(c_custkey)
      --00:SCAN HDFS  10     51.85ms   79.44ms   45.00M   45.00M    38.12 MB   88.00 MB       tpch_300_parquet.customer
      05:EXCHANGE     10     4.33s     4.98s     445.15M  405.00M   0 B        0 B            HASH(o_custkey)
      01:SCAN HDFS    10     349.88ms  407.81ms  445.15M  405.00M   660.73 MB  176.00 MB      tpch_300_parquet.orders

      Attachments

        1. quadratic-1p3m-summary-profile.txt (54 kB, Jim Apple)
        2. revert-open-addressing-1p3m-summary-profile.txt (53 kB, Jim Apple)
        3. HashClustering.png (42 kB, Mostafa Mokhtar)
        4. hash_fn.patch (4 kB, Jim Apple)
        5. 2.2-Q13.txt (235 kB, Mostafa Mokhtar)
        6. 2.3-Q13.txt (220 kB, Mostafa Mokhtar)

            People

              Assignee: Mostafa Mokhtar (mmokhtar)
              Reporter: Mostafa Mokhtar (mmokhtar)