Details
-
Improvement
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 0.6
-
None
-
None
Description
Impala's HBase scan is very slow. For scanning 40000 rows from an colocated HBase table, it took ~5.7sec.
The query is: select count from (select * from hbasetbl limit 40000);
Majority of the time is spent inside HBaseTableScanner::Next. I've done some experiment to figure out where we spent the time by adding a few more timers. Here's the runtime profile:
HBASE_SCAN_NODE (id=0):(5s714ms 99.98%)
- BytesRead: 4.26 MB
- HBaseTableScanner.ScanSetup: 483.594us
- HBaseTableScanner::ResultScanner_next: 5s710ms
- MemoryUsed: 0.00
- MyOwnTimer1: 1s387ms <-- We should trim this time.
- MyOwnTimer2: 2s798ms <--
- MyOwnTimer3: 688.179ms
- MyOwnTimer4: 0ns
- MyOwnTimer5: 0ns
- NumDisksAccessed: 0
- PerReadThreadRawHdfsThroughput: 5.14 MB/sec
- RowsReturned: 40.00K (40000)
- RowsReturnedRate: 7.00 K/sec
- ScanRangesComplete: 0
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 0ns
- MaterializeTupleTime: 0ns
- ScannerThreadsSysTime: 0ns
- ScannerThreadsUserTime: 0ns
- ScannerThreadsVoluntaryContextSwitches: 0
- TotalRawHdfsReadTime: 827.604ms <-- we spent only 800ms on fetching from HBase
- TotalReadThroughput: 775.05 KB/sec
I've attached the code with more timers in the attached file HBaseTAbleScanner.Next
When I increase the limit to 80000, the perf is much worse. Here's the profile:
HBASE_SCAN_NODE (id=0):(23s027ms 99.99%)
- BytesRead: 8.51 MB
- HBaseTableScanner.ScanSetup: 249.40us
- HBaseTableScanner::ResultScanner_next: 23s018ms
- MemoryUsed: 0.00
- MyOwnTimer1: 5s680ms
- MyOwnTimer2: 11s401ms
- MyOwnTimer3: 2s829ms
- MyOwnTimer4: 0ns
- MyOwnTimer5: 0ns
- NumDisksAccessed: 0
- PerReadThreadRawHdfsThroughput: 2.76 MB/sec
- RowsReturned: 80.00K (80000)
- RowsReturnedRate: 3.47 K/sec
- ScanRangesComplete: 0
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 0ns
- MaterializeTupleTime: 0ns
- ScannerThreadsSysTime: 0ns
- ScannerThreadsUserTime: 0ns
- ScannerThreadsVoluntaryContextSwitches: 0
- TotalRawHdfsReadTime: 3s085ms
- TotalReadThroughput: 376.52 KB/sec
I didn't see any disk activity. So, I dont' think HBase is going to disk. Also, the read time is only 3sec. The big chuck of time is spent within the block of MyOwnTimer1 and MyOwnTimer2. We spent 5x more time there even though we only scan 2x more rows.