Details

      Description

      After IMPALA-2736 there was an increase in peak memory consumption, mostly due to the Parquet scanner. In most cases the Parquet scanner ends up buffering more batches than needed.

      In the attached profiles, scanner memory increases from 2.17 GB (Impala 2.5) to 3.3 GB (Impala 2.6).

      Workarounds
      The following query options may help to reduce scanner memory consumption:

      • Reduce the number of scanner threads (set num_scanner_threads=30)
      • Reduce the batch size (set batch_size=512)

      Of course, increasing the memory limit (mem_limit) may also help.

      Attachments

      1. tpch_q14_2.5.txt (278 kB), Mostafa Mokhtar
      2. tpch_q14_2.6.txt (281 kB), Mostafa Mokhtar

        Activity

        John Russell added a comment:

        Added to Known Issues. Blanked out "doc text" field so this issue doesn't show up on my to-do list.

        Michael Ho added a comment:

        The increase in memory usage is probably due to the removal of the following check
        in HdfsParquetScanner::AssembleRows(), which limited the amount of memory consumed
        by a row batch. This is a regression against Impala 2.5 but not necessarily against 2.3.

                // Exit this loop early if the batch exceeds memory usage capacity
                if (pool->total_allocated_bytes() >= RowBatch::AT_CAPACITY_MEM_USAGE) {
                  // Update num_rows read and break
                  num_rows = i + 1;
                  break;
                }
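
        For illustration, below is a self-contained sketch of how that check bounded
        per-batch memory. MemPool, RowBatch, and AssembleRowsSketch here are simplified,
        hypothetical stand-ins rather than the real Impala classes: once the scratch pool
        crosses AT_CAPACITY_MEM_USAGE, the batch is cut short, so the varlen data held by
        any one batch stays bounded; without the check, the pool keeps growing until the
        row-count limit is reached.

                #include <cstdint>
                #include <cstdio>

                // Hypothetical, minimal stand-ins for Impala's MemPool and RowBatch,
                // just enough to run the early-exit check quoted above.
                struct MemPool {
                  int64_t allocated = 0;
                  void Allocate(int64_t bytes) { allocated += bytes; }
                  int64_t total_allocated_bytes() const { return allocated; }
                };

                struct RowBatch {
                  // Illustrative 8 MB cap, standing in for the real constant.
                  static constexpr int64_t AT_CAPACITY_MEM_USAGE = 8 * 1024 * 1024;
                };

                // Sketch of the materialization loop with the check in place: the
                // batch ends as soon as the scratch pool crosses the capacity limit.
                int AssembleRowsSketch(MemPool* pool, int capacity, int64_t varlen_per_row) {
                  int num_rows = capacity;
                  for (int i = 0; i < capacity; ++i) {
                    pool->Allocate(varlen_per_row);  // stands in for materializing one row
                    // Exit this loop early if the batch exceeds memory usage capacity
                    if (pool->total_allocated_bytes() >= RowBatch::AT_CAPACITY_MEM_USAGE) {
                      num_rows = i + 1;
                      break;
                    }
                  }
                  return num_rows;
                }

                int main() {
                  MemPool pool;
                  // With 64 KB of varlen data per row, the batch is cut off after
                  // 128 rows rather than the default 1024.
                  printf("rows in batch: %d\n", AssembleRowsSketch(&pool, 1024, 64 * 1024));
                  return 0;
                }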
        
        Huaisi Xu added a comment:

        I talked with Michael; we think this is caused by IMPALA-2473, part of which we reverted in 5.7.3, so this is not a regression in 5.7.3 and later. I'm removing the target version for now; feel free to re-tag it once this is about to be fixed.

        Michael Ho added a comment:

        Memory usage should decrease with this commit: https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git;a=commit;h=1522da3510a36635e3fc694b26211554fcd2793a
        It won't go back to the previous level due to the partial revert of IMPALA-2473.

        IMPALA-3662: Don't double allocate tuples buffer in parquet scanner

        HdfsScanner::StartNewRowBatch() is called once per row batch
        by the parquet scanner to allocate a new row batch and tuple
        buffer. Similarly, a scratch batch is created for each row
        batch in HdfsParquetScanner::AssembleRows() which also contains
        the tuple buffer. In reality, only the tuple buffer in the
        scratch batch is used. So, the tuple buffer allocated by
        HdfsScanner::StartNewRowBatch() is unused memory for the
        parquet scanner.

        This change fixes the problem above by implementing
        HdfsParquetScanner::StartNewRowBatch() which creates
        a new row batch without allocating the tuple buffer.
        With this patch, the memory consumption when
        materializing very wide tuples is reduced by half.

        Change-Id: I826061a2be10fd0528ca4dd1e97146e3cb983370
        Reviewed-on: http://gerrit.cloudera.org:8080/4064
        Reviewed-by: Michael Ho <kwho@cloudera.com>
        Tested-by: Internal Jenkins
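
        To make the shape of the fix concrete, here is a hedged, self-contained sketch;
        the types below are simplified stand-ins invented for illustration, not the real
        HdfsScanner/HdfsParquetScanner interfaces. The base scanner attaches a tuple
        buffer to every new row batch, while the parquet override skips that allocation
        because rows are materialized into a separate scratch batch's buffer.

                #include <cstdint>
                #include <memory>

                // Hypothetical, simplified stand-ins for the classes named in the
                // commit message, showing where the duplicate tuple buffer came from.
                struct TupleBuffer {
                  explicit TupleBuffer(int64_t bytes) : size(bytes) {}
                  int64_t size;
                };

                struct Batch {
                  std::unique_ptr<TupleBuffer> tuple_buf;  // stays null if never allocated
                };

                struct ScannerSketch {
                  int64_t tuple_buffer_bytes = 4 * 1024 * 1024;  // illustrative size only
                  virtual ~ScannerSketch() = default;

                  // Base behavior: every new row batch gets its own tuple buffer.
                  virtual std::unique_ptr<Batch> StartNewRowBatch() {
                    auto batch = std::make_unique<Batch>();
                    batch->tuple_buf = std::make_unique<TupleBuffer>(tuple_buffer_bytes);
                    return batch;
                  }
                };

                struct ParquetScannerSketch : ScannerSketch {
                  // The fix: rows are materialized into a separate scratch batch's
                  // tuple buffer, so the per-row-batch buffer above was never used.
                  // Skipping the allocation halves tuple-buffer memory for wide tuples.
                  std::unique_ptr<Batch> StartNewRowBatch() override {
                    return std::make_unique<Batch>();  // no tuple buffer allocated here
                  }
                };

                int main() {
                  ParquetScannerSketch scanner;
                  auto batch = scanner.StartNewRowBatch();
                  return batch->tuple_buf == nullptr ? 0 : 1;  // parquet batch: no buffer
                }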

        Silvius Rus added a comment:

        Michael Ho, Dan Hecht, is this fixed?

        Michael Ho added a comment:

        As mentioned in the previous update, the increase in memory was due to the partial undoing of IMPALA-2473, which limits the size of varlen data in a row batch. As it turns out, limiting the size of varlen data in a row batch leads to a non-trivial regression for very wide rows (e.g. CDH-41243): with a fixed per-batch cap, a very wide row leaves only a handful of rows per batch, so per-batch overhead starts to dominate. The commit https://github.com/apache/incubator-impala/commit/1522da3510a36635e3fc694b26211554fcd2793a already reduces some of the memory usage introduced by IMPALA-2736.


          People

          • Assignee: Michael Ho
          • Reporter: Mostafa Mokhtar
