Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
Impala 2.6.0
-
None
Description
The primitive_shuffle_join_union_all_with_groupby is showing a regression after IMPALA-2502, the regression appears to be happening due to serialization serialization of the branches of the UNION.
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail ------------------------------------------------------------------------------------------------------------------------------- 16:EXCHANGE 1 10.113us 10.113us 0 36.00M 0 -1.00 B UNPARTITIONED 15:AGGREGATE 15 5s502ms 6s002ms 0 36.00M 922.04 MB 604.24 MB FINALIZE 14:EXCHANGE 15 949.540ms 1s056ms 395.39M 36.00M 0 0 HASH(l_orderkey) 09:AGGREGATE 15 4s139ms 4s558ms 395.39M 36.00M 145.84 MB 604.24 MB STREAMING 00:UNION 15 4s078ms 4s489ms 398.68M 36.00M 8.00 KB 0 |--08:AGGREGATE 15 9s568ms 10s270ms 199.34M 18.00M 794.06 MB 151.06 MB FINALIZE | 07:HASH JOIN 15 14s089ms 14s888ms 774.31M 18.00M 1.13 GB 107.00 MB INNER JOIN, PARTITIONED | |--13:EXCHANGE 15 1s010ms 1s335ms 204.99M 45.00M 0 0 HASH(o2.o_orderkey) | | 06:SCAN HDFS 15 402.922ms 458.808ms 204.99M 45.00M 210.24 MB 176.00 MB tpch_300_parquet.orders o2 | 12:EXCHANGE 15 3s101ms 5s305ms 774.31M 180.00M 0 0 HASH(l2.l_orderkey) | 05:SCAN HDFS 15 690.566ms 754.985ms 774.31M 180.00M 263.63 MB 176.00 MB tpch_300_parquet.lineitem l2 04:AGGREGATE 15 9s157ms 9s725ms 199.34M 18.00M 794.06 MB 151.06 MB FINALIZE 03:HASH JOIN 15 13s297ms 14s457ms 774.31M 18.00M 1.13 GB 107.00 MB INNER JOIN, PARTITIONED |--11:EXCHANGE 15 1s057ms 1s360ms 204.99M 45.00M 0 0 HASH(o1.o_orderkey) | 02:SCAN HDFS 15 397.264ms 452.629ms 204.99M 45.00M 210.30 MB 176.00 MB tpch_300_parquet.orders o1 10:EXCHANGE 15 3s017ms 5s100ms 774.31M 180.00M 0 0 HASH(l1.l_orderkey) 01:SCAN HDFS 15 668.373ms 731.161ms 774.31M 180.00M 263.03 MB 176.00 MB tpch_300_parquet.lineitem l1
SELECT Count(*) ROWCOUNT FROM ( SELECT /* +straight_join */ l_orderkey FROM lineitem l1 INNER JOIN /* +shuffle */ orders o1 ON l1.l_orderkey = o1.o_orderkey WHERE l_shipdate < '1995-01-01' AND o1.o_orderdate < '1995-01-01' GROUP BY l_orderkey UNION ALL SELECT /* +straight_join */ l_orderkey FROM lineitem l2 INNER JOIN /* +shuffle */ orders o2 ON l2.l_orderkey = o2.o_orderkey WHERE l_shipdate < '1995-01-01' AND o2.o_orderdate < '1995-01-01' GROUP BY l_orderkey ) a GROUP BY l_orderkey HAVING ROWCOUNT = 99999999;
The profile counters show that one branch of the UNION is taking significantly more time than the other, also for some reason the profile shows significantly higher CPU time for the subquery waiting on open.
Fragment that gets blocked showing higher CPU
Averaged Fragment F04:(Total: 43s861ms, non-child: 0.000ns, % non-child: 0.00%) split sizes: min: 1.10 GB, max: 1.48 GB, avg: 1.25 GB, stddev: 107.42 MB completion times: min:43s595ms max:44s660ms mean: 44s148ms stddev:333.493ms execution rates: min:25.82 MB/sec max:34.06 MB/sec mean:29.09 MB/sec stddev:2.24 MB/sec num instances: 15 - AverageThreadTokens: 6.31 - BloomFilterBytes: 0 - PeakMemoryUsage: 181.76 MB (190585123) - PerHostPeakMemUsage: 2.47 GB (2654165942) - PrepareTime: 146.519ms - RowsProduced: 13.67M (13665799) - TotalCpuTime: 3m58s - TotalNetworkReceiveTime: 0.000ns - TotalNetworkSendTime: 39s952ms - TotalStorageWaitTime: 187.611ms CodeGen:(Total: 394.245ms, non-child: 394.245ms, % non-child: 100.00%) - CodegenTime: 0.000ns - CompileTime: 4.357ms - LoadTime: 0.000ns - ModuleBitcodeSize: 1.49 MB (1559356) - OptimizationTime: 243.888ms - PrepareTime: 143.875ms DataStreamSender (dst_id=13):(Total: 43s583ms, non-child: 43s583ms, % non-child: 100.00%) - BytesSent: 252.13 MB (264376136) - NetworkThroughput(*): 483.75 KB/sec - OverallThroughput: 5.78 MB/sec - PeakMemoryUsage: 480.00 KB (491520) - SerializeBatchTime: 2s074ms - ThriftTransmitTime(*): 8m58s - UncompressedRowBatchSize: 599.61 MB (628733578) HDFS_SCAN_NODE (id=6):(Total: 402.922ms, non-child: 402.922ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.00 - AverageScannerThreadConcurrency: 5.36 - BytesRead: 158.43 MB (166123086) - BytesReadDataNodeCache: 0 - BytesReadLocal: 158.43 MB (166123086) - BytesReadRemoteUnexpected: 0 - BytesReadShortCircuit: 158.43 MB (166123086) - DecompressionTime: 598.367ms - MaxCompressedTextFileLength: 0 - NumColumns: 2 (2) - NumDisksAccessed: 4 (4) - NumRowGroups: 5 (5) - NumScannerThreadsStarted: 5 (5) - PeakMemoryUsage: 181.31 MB (190120925) - PerReadThreadRawHdfsThroughput: 1.17 GB/sec - RemoteScanRanges: 0 (0) - RowsRead: 30.00M (30000000) - RowsReturned: 13.67M (13665799) - RowsReturnedRate: 34.00 M/sec - ScanRangesComplete: 5 (5) - ScannerThreadsInvoluntaryContextSwitches: 1.72K (1723) - ScannerThreadsTotalWallClockTime: 3m54s - MaterializeTupleTime(*): 3m54s - ScannerThreadsSysTime: 128.177ms - ScannerThreadsUserTime: 4s464ms - ScannerThreadsVoluntaryContextSwitches: 13.27K (13268) - TotalRawHdfsReadTime(*): 141.720ms - TotalReadThroughput: 3.60 MB/sec
Fragment that gets admitted right way showing lower CPU
Averaged Fragment F01:(Total: 3s974ms, non-child: 0.000ns, % non-child: 0.00%) split sizes: min: 1.10 GB, max: 1.48 GB, avg: 1.25 GB, stddev: 107.42 MB completion times: min:3s679ms max:5s077ms mean: 4s267ms stddev:372.035ms execution rates: min:283.68 MB/sec max:319.56 MB/sec mean:301.40 MB/sec stddev:11.10 MB/sec num instances: 15 - AverageThreadTokens: 5.58 - BloomFilterBytes: 0 - PeakMemoryUsage: 181.75 MB (190583485) - PerHostPeakMemUsage: 1.42 GB (1526291688) - PrepareTime: 144.209ms - RowsProduced: 13.67M (13665799) - TotalCpuTime: 23s330ms - TotalNetworkReceiveTime: 0.000ns - TotalNetworkSendTime: 123.212ms - TotalStorageWaitTime: 260.160ms CodeGen:(Total: 398.214ms, non-child: 398.214ms, % non-child: 100.00%) - CodegenTime: 0.000ns - CompileTime: 6.726ms - LoadTime: 0.000ns - ModuleBitcodeSize: 1.49 MB (1559356) - OptimizationTime: 247.982ms - PrepareTime: 142.337ms DataStreamSender (dst_id=11):(Total: 3s698ms, non-child: 3s698ms, % non-child: 100.00%) - BytesSent: 252.22 MB (264470845) - NetworkThroughput(*): 34.68 MB/sec - OverallThroughput: 68.30 MB/sec - PeakMemoryUsage: 480.00 KB (491520) - SerializeBatchTime: 2s046ms - ThriftTransmitTime(*): 7s274ms - UncompressedRowBatchSize: 599.61 MB (628733578) HDFS_SCAN_NODE (id=2):(Total: 397.264ms, non-child: 397.264ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.04 - AverageScannerThreadConcurrency: 5.05 - BytesRead: 158.43 MB (166123086) - BytesReadDataNodeCache: 0 - BytesReadLocal: 158.43 MB (166123086) - BytesReadRemoteUnexpected: 0 - BytesReadShortCircuit: 158.43 MB (166123086) - DecompressionTime: 579.748ms - MaxCompressedTextFileLength: 0 - NumColumns: 2 (2) - NumDisksAccessed: 4 (4) - NumRowGroups: 5 (5) - NumScannerThreadsStarted: 5 (5) - PeakMemoryUsage: 181.31 MB (190121458) - PerReadThreadRawHdfsThroughput: 1.13 GB/sec - RemoteScanRanges: 0 (0) - RowsRead: 30.00M (30000000) - RowsReturned: 13.67M (13665799) - RowsReturnedRate: 34.49 M/sec - ScanRangesComplete: 5 (5) - ScannerThreadsInvoluntaryContextSwitches: 1.73K (1732) - ScannerThreadsTotalWallClockTime: 19s447ms - MaterializeTupleTime(*): 18s475ms - ScannerThreadsSysTime: 124.778ms - ScannerThreadsUserTime: 4s347ms - ScannerThreadsVoluntaryContextSwitches: 13.18K (13184) - TotalRawHdfsReadTime(*): 142.595ms - TotalReadThroughput: 39.41 MB/sec