Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Impala 0.7
None
None
Description
A simple aggregation query (TPCH-Q1) on a 10TB dataset takes 35 minutes to complete using the Parquet format, compared to 15 minutes for RC/SNAP.
Parquet
-----------
HDFS_SCAN_NODE (id=0):(14m36s 57.33%)
 - AverageHdfsReadThreadConcurrency: 0.61
 - HdfsReadThreadConcurrencyCountPercentage=0: 56.68
 - HdfsReadThreadConcurrencyCountPercentage=1: 30.59
 - HdfsReadThreadConcurrencyCountPercentage=10: 0.00
 - HdfsReadThreadConcurrencyCountPercentage=11: 0.00
 - HdfsReadThreadConcurrencyCountPercentage=12: 0.00
 - HdfsReadThreadConcurrencyCountPercentage=2: 9.21
 - HdfsReadThreadConcurrencyCountPercentage=3: 2.33
 - HdfsReadThreadConcurrencyCountPercentage=4: 0.76
 - HdfsReadThreadConcurrencyCountPercentage=5: 0.31
 - HdfsReadThreadConcurrencyCountPercentage=6: 0.08
 - HdfsReadThreadConcurrencyCountPercentage=7: 0.04
 - HdfsReadThreadConcurrencyCountPercentage=8: 0.01
 - HdfsReadThreadConcurrencyCountPercentage=9: 0.00
 - AverageScannerThreadConcurrency: 1.47
 - BytesRead: 97.49 GB
 - DecompressionTime: 8m10s
 - MemoryUsed: 0.00
 - NumDisksAccessed: 11
 - PerReadThreadRawHdfsThroughput: 109.55 MB/sec
 - RowsReturned: 5.92B (5915604448)
 - RowsReturnedRate: 6.76 M/sec
 - ScanRangesComplete: 934
 - ScannerThreadsInvoluntaryContextSwitches: 256.99K (256986)
 - ScannerThreadsTotalWallClockTime: 202h50m
 - MaterializeTupleTime: 0ns
 - ScannerThreadsSysTime: 22s143ms
 - ScannerThreadsUserTime: 35m34s
 - ScannerThreadsVoluntaryContextSwitches: 202.23K (202226)
 - TotalRawHdfsReadTime: 15m29s
 - TotalReadThroughput: 66.38 MB/sec
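As a sanity check on the Parquet counters above, the following Python sketch recomputes the average read-thread concurrency from the per-level percentages. It assumes that each HdfsReadThreadConcurrencyCountPercentage=N value is the percentage of samples in which N HDFS read threads were active; that interpretation is not stated in the report. The function name `weighted_average_concurrency` is hypothetical and used only for illustration.

```python
# Values copied from the HdfsReadThreadConcurrencyCountPercentage=N counters
# in the Parquet profile. Assumption: each value is the percentage of samples
# in which exactly N HDFS read threads were active.
concurrency_pct = {
    0: 56.68, 1: 30.59, 2: 9.21, 3: 2.33, 4: 0.76, 5: 0.31,
    6: 0.08, 7: 0.04, 8: 0.01, 9: 0.00, 10: 0.00, 11: 0.00, 12: 0.00,
}


def weighted_average_concurrency(pct_by_threads):
    """Average thread count, weighting each level by its share of samples."""
    total = sum(pct_by_threads.values())
    return sum(n * pct for n, pct in pct_by_threads.items()) / total


avg = weighted_average_concurrency(concurrency_pct)
idle_fraction = concurrency_pct[0] / 100.0

print(f"average read-thread concurrency: {avg:.2f}")
print(f"fraction of samples with no read thread active: {idle_fraction:.2%}")
```

Under that assumption the weighted average agrees with the reported AverageHdfsReadThreadConcurrency of 0.61, and the =0 bucket suggests that no read was in flight for more than half of the samples.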