IMPALA-7980

High system CPU time usage (and waste) when runtime filters filter out files


Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: Impala 3.2.0
    • Labels: ghx-label-5

    Description

      When running TPC-DS query 1 on scale factor 10,000 (10TB) on a 140-node cluster with replica_preference=remote, we observed really high system CPU usage for some of the scan nodes:

      HDFS_SCAN_NODE (id=6):(Total: 59s107ms, non-child: 59s107ms, % non-child: 100.00%)
      - BytesRead: 80.50 MB (84408563)
      - ScannerThreadsSysTime: 36m17s

      Note the ratio: the scanner threads accumulated over 36 minutes of system CPU time in a scan node whose total time was only about 59 seconds.
      

      Using perf, we discovered a lot of time spent in futex_wait, pthread_cond_wait, and the like. (We also used perf to record context switches and cycles.) Interestingly, watching in top, we saw the very high system CPU usage spike only some time into the query.

      We believe what's going on is that we start many ScannerThread instances, which first wait until the initial ranges have been issued and then grab data using impala::io::ScanRange::GetNext(). They do this in a loop that acquires two locks per iteration, until the query is done or there are no num_unqueued_files_ left. If num_unqueued_files_ stays above zero, these threads just loop through the two lock acquisitions and nothing else. We believe that this hot loop is eating system CPU aggressively; a simplified sketch of the pattern follows.
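
      The following is a standalone sketch of that pattern, assuming a deliberately simplified structure: the names TryGetNextRange, range_lock_, and queue_lock_ are stand-ins for illustration, not Impala's actual code.

      #include <atomic>
      #include <mutex>
      #include <optional>

      // Simplified stand-ins for the shared scan-range state; not the real classes.
      std::mutex range_lock_;                    // first lock taken each iteration
      std::mutex queue_lock_;                    // second lock taken each iteration
      std::atomic<int> num_unqueued_files_{1};   // files whose ranges are not issued yet
      bool done_ = false;

      // Pretend no range is ready yet: ranges for the remaining files have not
      // been issued, so there is nothing to hand out.
      std::optional<int> TryGetNextRange() {
        std::lock_guard<std::mutex> l(queue_lock_);
        return std::nullopt;
      }

      // The busy loop: while files remain unqueued but no range is available,
      // each scanner thread keeps acquiring both locks, finds no work, and
      // immediately loops again. Contended lock acquisitions end up in futex
      // system calls, which is where the system CPU time goes.
      void ScannerThreadLoop() {
        while (true) {
          {
            std::lock_guard<std::mutex> l(range_lock_);
            if (done_) return;
          }
          std::optional<int> range = TryGetNextRange();
          if (range) {
            // ... process the scan range ...
          } else if (num_unqueued_files_.load() == 0) {
            return;  // no more work will ever arrive
          }
          // else: spin around the loop again without ever blocking.
        }
      }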

      It's a bit interesting that this is exacerbated in the case with more remote reads. Our best guess is that some of the reads take significantly longer in this case, and a single outlier can extend this period of waste.
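
      The ticket does not describe how this was ultimately resolved; purely as an illustration of why the loop above is wasteful, one common way to avoid this kind of spin is to block on a condition variable until a range is queued or the thread has a reason to exit. A hypothetical sketch, not Impala's actual code:

      #include <condition_variable>
      #include <mutex>

      // Hypothetical, simplified shared state.
      std::mutex lock_;
      std::condition_variable work_available_;
      int ranges_ready_ = 0;      // scan ranges queued and ready to hand out
      int unqueued_files_ = 1;    // files whose ranges have not been issued yet
      bool done_ = false;

      // Instead of re-acquiring locks in a tight loop, the thread sleeps until
      // there is a range to process or a reason to exit, so idle waiting costs
      // essentially no CPU.
      void ScannerThreadLoop() {
        std::unique_lock<std::mutex> l(lock_);
        while (true) {
          work_available_.wait(l, [] {
            return done_ || ranges_ready_ > 0 || unqueued_files_ == 0;
          });
          if (done_) return;
          if (ranges_ready_ > 0) {
            --ranges_ready_;
            // In real code the lock would be dropped here while the range is
            // processed, then re-acquired before waiting again.
          } else {
            return;  // unqueued_files_ == 0 and nothing queued: all work issued
          }
        }
      }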

          People

            Assignee: Unassigned
            Reporter: Philip Martin
