[HIVE-23166] Guard VGB from flushing too often - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 4.0.0
Fix Version/s: None
Component/s: llap
Labels: None

Description

The existing flush logic in VectorGroupByOperator is completely static.
It depends on two settings: the maximum number of hash table entries (maxHtEntries, set via hive.vectorized.groupby.maxentries) and the maximum memory threshold (by default 90% of available memory).

Assuming that we are not memory constrained, the periodicity of flushing is currently dictated by the static number of entries (1M by default), which can also be misconfigured to a very low value.

I propose taking current memory usage into account along with maxHtEntries, to avoid flushing too often, since frequent flushing can hurt operator throughput for particular workloads.
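The proposal can be illustrated with a minimal sketch. It is not the code from the attached patches: the class name, the method name, and the lowMemoryFraction parameter are hypothetical, and the only values taken from the description are the 1M entry limit and the 90% memory threshold. The idea shown is that exceeding the entry limit alone does not trigger a flush while memory usage is still low.

```java
// Hypothetical sketch of a memory-aware flush guard; not the actual Hive implementation.
final class FlushGuardSketch {
    private final long maxHtEntries;         // entry limit (1M by default per the description)
    private final double maxMemoryFraction;  // hard memory threshold (0.9 by default per the description)
    private final double lowMemoryFraction;  // assumed lower bound below which entry-based flushes are skipped

    FlushGuardSketch(long maxHtEntries, double maxMemoryFraction, double lowMemoryFraction) {
        this.maxHtEntries = maxHtEntries;
        this.maxMemoryFraction = maxMemoryFraction;
        this.lowMemoryFraction = lowMemoryFraction;
    }

    /** Decides whether the hash table should be flushed given its size and current memory usage. */
    boolean shouldFlush(long numEntries, long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        double usedFraction = (double) usedBytes / maxBytes;

        // Memory pressure always forces a flush, regardless of the entry count.
        if (usedFraction >= maxMemoryFraction) {
            return true;
        }
        // The entry limit only triggers a flush once memory usage is no longer low,
        // which guards against a misconfigured, very small maxHtEntries.
        return numEntries >= maxHtEntries && usedFraction >= lowMemoryFraction;
    }
}
```

With new FlushGuardSketch(1_000_000L, 0.9, 0.5), a table holding 2M entries at 10% memory usage is not flushed, while the same table at 60% usage is.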

Attachments

HIVE-23166.01.patch | 09/Apr/20 15:34 | 6 kB | Panagiotis Garefalakis
HIVE-23166.02.patch | 10/Apr/20 17:18 | 6 kB | Panagiotis Garefalakis
HIVE-23166.03.patch | 12/Apr/20 12:57 | 6 kB | Panagiotis Garefalakis

People

Assignee: Panagiotis Garefalakis

Reporter: Panagiotis Garefalakis

Votes: 0

Watchers: 1

Dates

Created: 09/Apr/20 15:30

Updated: 12/Apr/20 22:06

Resolved: 12/Apr/20 22:06