Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
We have seen a number of memory issues due to the HashSinkOperator's use of the MapJoinMemoryExhaustionHandler. This handler is meant to detect scenarios where the small table is taking up too much memory, in which case a MapJoinMemoryExhaustionError is thrown.
The configs that control this logic are:
- hive.mapjoin.localtask.max.memory.usage (default 0.90)
- hive.mapjoin.followby.gby.localtask.max.memory.usage (default 0.55)
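For reference, a minimal sketch of reading these thresholds from a HiveConf (the two keys and defaults are the real config names above; the wrapper class is illustrative only):

    import org.apache.hadoop.hive.conf.HiveConf;

    public class ThresholdConfig {
      public static void main(String[] args) {
        HiveConf conf = new HiveConf();
        // Fraction of the heap the local task's hash table may occupy.
        float maxMemoryUsage = conf.getFloat(
            "hive.mapjoin.localtask.max.memory.usage", 0.90f);
        // Stricter limit when the map join is followed by a group by,
        // which needs heap space of its own.
        float followByGbyMaxMemoryUsage = conf.getFloat(
            "hive.mapjoin.followby.gby.localtask.max.memory.usage", 0.55f);
        System.out.printf("thresholds: %.2f / %.2f%n",
            maxMemoryUsage, followByGbyMaxMemoryUsage);
      }
    }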
The handler uses the MemoryMXBean and estimates how much memory the HashMap is consuming via the ratio MemoryMXBean#getHeapMemoryUsage().getUsed() / MemoryMXBean#getHeapMemoryUsage().getMax().
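In outline, the check looks like the following sketch (the class and method here are illustrative stand-ins, not Hive's actual MapJoinMemoryExhaustionHandler; the MemoryMXBean calls and the ratio are as described above):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;

    public class MemoryExhaustionCheck {
      private static final MemoryMXBean MEMORY_BEAN =
          ManagementFactory.getMemoryMXBean();

      // maxMemoryUsage is the configured fraction, e.g. 0.90 from
      // hive.mapjoin.localtask.max.memory.usage.
      public static void checkMemoryStatus(double maxMemoryUsage) {
        long used = MEMORY_BEAN.getHeapMemoryUsage().getUsed();
        long max = MEMORY_BEAN.getHeapMemoryUsage().getMax();
        double percentage = (double) used / max;
        if (percentage > maxMemoryUsage) {
          // Hive throws MapJoinMemoryExhaustionError here; a plain Error
          // stands in for it in this sketch.
          throw new Error(String.format(
              "Hash table loading exceeded memory limit: %.2f > %.2f",
              percentage, maxMemoryUsage));
        }
      }
    }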
The issue is that MemoryMXBean#getHeapMemoryUsage().getUsed() can be inaccurate: the value it returns counts all objects on the heap, reachable and unreachable, so it may include a large amount of garbage that the JVM simply hasn't taken the time to reclaim. This can cause the check to fail intermittently even though a simple GC would have freed enough space for the process to continue working.
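This behavior is easy to reproduce in isolation. In the sketch below, getUsed() still counts allocations that are no longer reachable until a collection actually runs (System.gc() is only a hint to the JVM, so exact numbers will vary):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.util.ArrayList;
    import java.util.List;

    public class HeapUsedDemo {
      public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();

        // Allocate ~100 MB, then drop every reference to it.
        List<byte[]> junk = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
          junk.add(new byte[1024 * 1024]);
        }
        junk = null;

        // The dead allocations are still counted as "used".
        System.out.println("before GC: " + bean.getHeapMemoryUsage().getUsed());

        // After a collection, the same metric can drop dramatically.
        System.gc();
        System.out.println("after GC:  " + bean.getHeapMemoryUsage().getUsed());
      }
    }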
We should re-think the usage of MapJoinMemoryExhaustionHandler for HoS. In Hive-on-MR this handler probably made sense because every Hive task ran in a dedicated container, so a Hive task could assume it created most of the data on the heap. In Hive-on-Spark, however, multiple Hive tasks can run in a single executor, each doing different things.
Attachments
Issue Links
- is blocked by HIVE-18319 Upgrade to Hadoop 3.0.0 (Closed)