Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Won't Fix
- Affects Version: Impala 2.3.0
Description
I noticed this issue with the Kudu scan node implementation, but I imagine it could happen with HDFS as well:
- on a big box, we started up 48 scanner threads for a 'SELECT COUNT' query
- the underlying batches that are read from Kudu return a few million rows each (because it's trying to do large IOs to amortize round-trips, and the projection is empty for COUNT)
- each scanner thread chops these Kudu batches into RowBatches of 1000 rows each and pushes those onto the RowBatchQueue
Because each backend IO (scan RPC to Kudu) results in thousands of Impala RowBatches, we end up with the main thread pulling round-robin from all of the scanner threads, rather than exhausting one Kudu batch before moving to the next. As a result, we see the following:
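A back-of-the-envelope sketch of that fan-out, using assumed numbers for illustration (one scan RPC returning ~3M rows, which is plausible when the projection is empty for COUNT; the 1000-row RowBatch size and 48 threads are from the report above):

```cpp
#include <cassert>

// Assumed size of one Kudu scan RPC result, for illustration only.
constexpr long kRowsPerKuduBatch = 3000000;
// RowBatch capacity and thread count from the report.
constexpr long kRowsPerRowBatch = 1000;
constexpr int kScannerThreads = 48;

// Each backend IO fans out into thousands of small RowBatches...
constexpr long kRowBatchesPerIo = kRowsPerKuduBatch / kRowsPerRowBatch;

// ...so across 48 threads the consumer has on the order of 144k queued
// batches to pull round-robin before any thread issues its next scan RPC.
constexpr long kQueuedBatchesAcrossThreads = kRowBatchesPerIo * kScannerThreads;
```

This is why the consumer interleaves across all producers for a long stretch: every thread's buffer shrinks at roughly the same rate.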
- when the query starts, all 48 threads hammer the Kudu server with scan RPCs
- the Kudu server is then completely quiet for ~30 seconds while the scanner threads drain their buffers
- all of the buffers empty at essentially the same time, and we get another herd of IO on the Kudu side
It would be preferable to make the RowBatchQueue "unfair" in some way, such that the main thread exhausts entire IO buffers at a time, rather than pulling little bits from each of the threads in a round-robin fashion.