Currently the I/O buffer queue size can vary dynamically from 2 buffers up to 128 buffers. Retaining this behaviour while adding a memory constraint to the HDFS scans presents some challenges. The memory constraint is much simpler to implement if the scan node simply hands the I/O mgr a fixed amount of reservation to work with.
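Under a fixed reservation, the buffer count falls out of simple arithmetic. A minimal sketch of what the scan node could do (the function name is hypothetical, the 8MB default buffer size is an assumption based on the numbers later in this doc, and the 2/128 bounds come from the current queue limits; this is not the actual Impala code):

```python
def buffers_from_reservation(reservation_bytes,
                             buffer_bytes=8 * 1024 * 1024,  # assumed 8MB I/O buffers
                             min_buffers=2, max_buffers=128):
    """How many I/O buffers a fixed reservation pays for, clamped to [2, 128]."""
    return max(min_buffers, min(max_buffers, reservation_bytes // buffer_bytes))
```

For example, a 24MB reservation pays for 3 buffers, while anything under 16MB still gets the 2-buffer minimum needed to overlap I/O and compute.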
Having 2 buffers is clearly beneficial because it allows overlapping I/O and compute: one buffer can be filled by I/O while the other is consumed by the reader.
Having > 2 buffers is not clearly beneficial. There are a few cases:
1. I/O is faster than compute and the queue fills up. In that case there is no benefit to having > 2 buffers because the consumer never sees an empty queue.
2. Compute is faster than I/O and the queue is empty. In that case additional buffering does not help - the consumer is always waiting on I/O, and extra queue slots never fill up.
3. Compute and I/O are approximately matched, but with some variability: I/O may get a bit ahead of compute, and compute could speed up or I/O slow down temporarily. With up to 2 buffers in the I/O manager and 1 buffer being processed by the reader, I/O can be 8MB-16MB ahead of compute and can therefore absorb bursts of up to 8MB. That seems like it should be enough in most cases. In this case it's also not clear to me that the dynamic sizing algorithm is particularly beneficial, since it shrinks the queue aggressively when the scan gets ahead of the consumer.
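Cases 1 and 2 can be checked with a small discrete-time model of a bounded producer/consumer queue (a sketch for intuition, not the I/O mgr code; the tick durations and the recurrence are illustrative assumptions):

```python
def scan_finish_time(n_buffers, produce_ticks, consume_ticks, capacity):
    """Time for a reader to consume n_buffers through a bounded queue.

    The producer (I/O) may hold at most `capacity` produced-but-unconsumed
    buffers: it cannot start buffer i until buffer i - capacity has been
    dequeued by the consumer (compute).
    """
    prod_done = []   # time buffer i becomes available in the queue
    cons_start = []  # time the reader dequeues buffer i
    cons_done = 0    # time the reader finishes buffer i
    for i in range(n_buffers):
        # Producer waits for its previous buffer and for a free queue slot.
        start = prod_done[i - 1] if i >= 1 else 0
        if i >= capacity:
            start = max(start, cons_start[i - capacity])
        prod_done.append(start + produce_ticks)
        # Consumer waits for the buffer to be ready and for itself.
        cons_start.append(max(prod_done[i], cons_done))
        cons_done = cons_start[i] + consume_ticks
    return cons_done

# Case 1: I/O faster than compute -- capacity beyond 2 buys nothing.
assert scan_finish_time(10, 1, 3, capacity=2) == scan_finish_time(10, 1, 3, capacity=8)
# Case 2: compute faster than I/O -- the queue never fills, capacity is irrelevant.
assert scan_finish_time(10, 3, 1, capacity=2) == scan_finish_time(10, 3, 1, capacity=8)
```

In both mismatched cases the finish time is pinned by the slower side plus one buffer of pipeline fill, independent of queue capacity; only the bursty case 3 can benefit from extra slots, and then only up to the size of the burst.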