[IMPALA-8888] Profile fetch performance when result spooling is enabled - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: Not Applicable
Component/s: None
Labels: None

Description

Profile the performance of fetching rows when result spooling is enabled. There are a few queries that can be used to benchmark the performance:

time ./bin/impala-shell.sh -B -q "select l_orderkey from tpch_parquet.lineitem" > /dev/null

time ./bin/impala-shell.sh -B -q "select * from tpch_parquet.orders" > /dev/null

The first query fetches one column and 6,001,215 rows; the second fetches 9 columns and 1,500,000 rows. Together they cover a mix of rows fetched vs. columns fetched.

The baseline for the benchmark should be the commit prior to IMPALA-8780.

The benchmark should check for both latency and CPU usage (to see if the copy into BufferedTupleStream has a significant overhead).

The benchmark should also use various fetch sizes to see whether increasing the fetch size improves performance when result spooling is enabled (ideally it should). It would also be nice to run some fetches between machines, since that better reflects network round-trip latencies.
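As a rough starting point, the timing runs above could be scripted along these lines. This is only a sketch under stated assumptions: `bench` and `impala_shell_cmd` are hypothetical helper names, the run count and the use of the median are arbitrary choices, and the CPU figure covers only the client process (impala-shell) on a Unix host. Any overhead from the copy into BufferedTupleStream is incurred on the server side and would have to be measured separately. The sketch does not vary the fetch size, since the description does not say how it is set.

```python
import os
import statistics
import subprocess
import sys
import time


def impala_shell_cmd(query):
    """Build the impala-shell invocation used in the description (hypothetical helper)."""
    return ["./bin/impala-shell.sh", "-B", "-q", query]


def bench(cmd, runs=3):
    """Run cmd `runs` times, discarding its stdout.

    Returns (median wall-clock seconds, median client CPU seconds).
    CPU time is user + system time of the child process only, so it
    does not include any work done by the Impala daemons.
    """
    walls, cpus = [], []
    for _ in range(runs):
        before = os.times()
        start = time.perf_counter()
        # Equivalent of `time <cmd> > /dev/null`; raises if the command fails.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        walls.append(time.perf_counter() - start)
        after = os.times()
        cpus.append(
            (after.children_user - before.children_user)
            + (after.children_system - before.children_system)
        )
    return statistics.median(walls), statistics.median(cpus)
```

For example, `bench(impala_shell_cmd("select l_orderkey from tpch_parquet.lineitem"))` would be run once on the baseline commit and once with result spooling enabled, and the two result pairs compared.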

People

Assignee: Sahil Takiar

Reporter: Sahil Takiar

Votes: 0

Watchers: 2

Dates

Created: 23/Aug/19 16:40

Updated: 26/Sep/19 15:22

Resolved: 26/Sep/19 15:22