Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 2.6.0
Description
It appears that many of the filters in TPC-DS Q53 & 89 are arriving later than before and are incorrectly sized.
Baseline
Fragment F00: Instance 7e4ec7ada127913f:c02d32b8ed7f42ca (host=vb0220.halxg.cloudera.com:22000):(Total: 1s394ms, non-child: 0.000ns, % non-child: 0.00%) Hdfs split stats (<volume id>:<# splits>/<split lengths>): 8:1/46.53 MB 9:3/372.64 MB 4:2/152.63 MB 10:4/305.54 MB 3:3/313.09 MB 1:5/347.22 MB 0:3/313.05 MB 2:1/46.43 MB 7:3/140.45 MB Filter 0 arrival: 531ms Filter 1 arrival: 531ms Filter 2 arrival: 541ms MemoryUsage(500.000ms): 0, 3.53 MB, 226.26 MB, 86.17 MB ThreadUsage(500.000ms): 1, 1, 12, 3 - AverageThreadTokens: 4.25 - BloomFilterBytes: 3.00 MB (3145728) HASH_JOIN_NODE (id=4):(Total: 283.512ms, non-child: 207.675ms, % non-child: 73.25%) - BuildPartitionTime: 6.208ms - BuildRows: 22.27K (22273) - BuildRowsPartitioned: 22.27K (22273) - BuildTime: 756.445us - GetNewBlockTime: 464.217us - HashBuckets: 57.34K (57344) - HashCollisions: 0 (0) - LargestPartitionPercent: 6 (6) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 11.19 MB (11729274) - PinTime: 0.000ns - ProbeRows: 2.75M (2746967) - ProbeRowsPartitioned: 0 (0) - ProbeTime: 188.916ms - RowsReturned: 2.75M (2746967) - RowsReturnedRate: 9.81 M/sec - SpilledPartitions: 0 (0) - UnpinTime: 3.819us EXCHANGE_NODE (id=12):(Total: 729.513us, non-child: 729.513us, % non-child: 100.00%) - BytesReceived: 926.80 KB (949048) - ConvertRowBatchTime: 591.715us - DeserializeRowBatchTimer: 3.104ms - FirstBatchArrivalWaitTime: 0.000ns - PeakMemoryUsage: 0 - RowsReturned: 22.27K (22273) - RowsReturnedRate: 33.10 M/sec - SendersBlockedTimer: 0.000ns - SendersBlockedTotalTimer(*): 0.000ns HDFS_SCAN_NODE (id=1):(Total: 75.106ms, non-child: 75.106ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.09 - AverageScannerThreadConcurrency: 8.74 - BytesRead: 252.72 MB (265000495) - BytesReadDataNodeCache: 0 - BytesReadLocal: 252.72 MB (265000495) - BytesReadRemoteUnexpected: 0 - BytesReadShortCircuit: 252.72 MB (265000495) - DecompressionTime: 768.064ms - MaxCompressedTextFileLength: 0 - NumColumns: 3 (3) - NumDisksAccessed: 9 (9) - NumRowGroups: 24 (24) - NumScannerThreadsStarted: 11 (11) - PeakMemoryUsage: 225.63 MB (236592890) - PerReadThreadRawHdfsThroughput: 1.37 GB/sec - RemoteScanRanges: 0 (0) - RowsRead: 36.87M (36867877) - RowsReturned: 2.75M (2746967) - RowsReturnedRate: 42.93 M/sec - ScanRangesComplete: 24 (24) - ScannerThreadsInvoluntaryContextSwitches: 1.08K (1077) - ScannerThreadsTotalWallClockTime: 10s076ms - MaterializeTupleTime(*): 8s860ms - ScannerThreadsSysTime: 65.784ms - ScannerThreadsUserTime: 6s987ms - ScannerThreadsVoluntaryContextSwitches: 2.18K (2178) - TotalRawHdfsReadTime(*): 180.359ms - TotalReadThroughput: 160.33 MB/sec Filter 0: - Rows processed: 399.75K (399745) - Rows rejected: 9.43K (9428) - Rows total: 399.77K (399769) Filter 1: - Files processed: 24 (24) - Files rejected: 0 (0) - Files total: 24 (24) - RowGroups processed: 260.67K (260672) - RowGroups rejected: 0 (0) - RowGroups total: 260.67K (260672) - Rows processed: 399.75K (399745) - Rows rejected: 0 (0) - Rows total: 399.77K (399769) - Splits processed: 24 (24) - Splits rejected: 0 (0) - Splits total: 24 (24) Filter 2: - Rows processed: 36.86M (36858448) - Rows rejected: 34.11M (34111480) - Rows total: 36.86M (36858448)
Regressed build
Fragment F00: Instance f84bfa3793b2dea4:b81a13e89ceb11c2 (host=vb0222.halxg.cloudera.com:22000):(Total: 3s064ms, non-child: 0.000ns, % non-child: 0.00%) Hdfs split stats (<volume id>:<# splits>/<split lengths>): 6:3/314.84 MB 3:3/90.64 MB 5:3/314.50 MB 0:1/58.70 MB 4:1/58.56 MB 7:4/303.02 MB 10:1/58.35 MB 9:4/320.69 MB 8:5/312.42 MB 1:5/431.00 MB 2:6/268.45 MB Filter 1 arrival: 1s440ms Filter 0 arrival: 1s441ms Filter 2 arrival: 1s448ms MemoryUsage(500.000ms): 1.39 MB, 3.53 MB, 3.53 MB, 201.61 MB, 177.80 MB, 221.07 MB, 266.43 MB, 242.92 MB, 65.87 MB ThreadUsage(500.000ms): 1, 1, 1, 12, 12, 12, 12, 10, 3 - AverageThreadTokens: 7.11 - BloomFilterBytes: 12.00 KB (12288) HASH_JOIN_NODE (id=4):(Total: 1s757ms, non-child: 1s586ms, % non-child: 90.26%) - BuildPartitionTime: 6.879ms - BuildRows: 22.27K (22273) - BuildRowsPartitioned: 22.27K (22273) - BuildTime: 695.087us - GetNewBlockTime: 1.048ms - HashBuckets: 57.34K (57344) - HashCollisions: 0 (0) - LargestPartitionPercent: 6 (6) - MaxPartitionLevel: 0 (0) - NumRepartitions: 0 (0) - PartitionsCreated: 16 (16) - PeakMemoryUsage: 11.86 MB (12431900) - PinTime: 0.000ns - ProbeRows: 36.85M (36854518) - ProbeRowsPartitioned: 0 (0) - ProbeTime: 1s531ms - RowsReturned: 2.75M (2746669) - RowsReturnedRate: 1.56 M/sec - SpilledPartitions: 0 (0) - UnpinTime: 1.792us EXCHANGE_NODE (id=12):(Total: 858.319us, non-child: 858.319us, % non-child: 100.00%) - BytesReceived: 985.11 KB (1008754) - ConvertRowBatchTime: 696.263us - DeserializeRowBatchTimer: 3.330ms - FirstBatchArrivalWaitTime: 0.000ns - PeakMemoryUsage: 0 - RowsReturned: 22.27K (22273) - RowsReturnedRate: 26.92 M/sec - SendersBlockedTimer: 0.000ns - SendersBlockedTotalTimer(*): 0.000ns HDFS_SCAN_NODE (id=1):(Total: 170.345ms, non-child: 170.345ms, % non-child: 100.00%) - AverageHdfsReadThreadConcurrency: 0.87 - AverageScannerThreadConcurrency: 9.85 - BytesRead: 445.47 MB (467114068) - BytesReadDataNodeCache: 0 - BytesReadLocal: 444.88 MB (466492841) - BytesReadRemoteUnexpected: 0 - BytesReadShortCircuit: 444.88 MB (466492841) - DecompressionTime: 0.000ns - MaxCompressedTextFileLength: 0 - NumColumns: 3 (3) - NumDisksAccessed: 11 (11) - NumRowGroups: 34 (34) - NumScannerThreadsStarted: 11 (11) - PeakMemoryUsage: 259.50 MB (272107341) - PerReadThreadRawHdfsThroughput: 232.34 MB/sec - RemoteScanRanges: 6 (6) - RowsRead: 36.87M (36867877) - RowsReturned: 36.85M (36854518) - RowsReturnedRate: 217.35 M/sec - ScanRangesComplete: 34 (34) - ScannerThreadsInvoluntaryContextSwitches: 1.74K (1739) - ScannerThreadsTotalWallClockTime: 26s507ms - MaterializeTupleTime(*): 24s095ms - ScannerThreadsSysTime: 180.434ms - ScannerThreadsUserTime: 4s912ms - ScannerThreadsVoluntaryContextSwitches: 35.76K (35761) - TotalRawHdfsReadTime(*): 1s975ms - TotalReadThroughput: 116.83 MB/sec Filter 0 (4.00 KB): - Rows processed: 569.04K (569036) - Rows rejected: 13.36K (13359) - Rows total: 569.07K (569070) Filter 1 (4.00 KB): - Files processed: 34 (34) - Files rejected: 0 (0) - Files total: 34 (34) - RowGroups processed: 36.79K (36791) - RowGroups rejected: 0 (0) - RowGroups total: 36.79K (36791) - Rows processed: 569.04K (569036) - Rows rejected: 0 (0) - Rows total: 569.07K (569070) - Splits processed: 34 (34) - Splits rejected: 0 (0) - Splits total: 34 (34) Filter 2 (4.00 KB): - Rows processed: 0 (0) - Rows rejected: 0 (0) - Rows total: 0 (0)
RF001 which is on a partition column, doesn't filter any of the files which is un-expected.
Attachments
Attachments
Issue Links
- is related to
-
IMPALA-3465 TPC-H Q8 regression introduced by IMPALA-3007 for incorrectly sized bloom filter
- Resolved
-
IMPALA-3480 Add query options to set min / max filter sizes
- Resolved