[TEZ-1162] Tez leaks CodecPool buffers - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.5.0
Fix Version/s: 0.5.0, 0.4.1
Component/s: None
Labels: None
Target Version/s: 0.5.0

Description

The Tez Fetcher currently leaks a CodecPool buffer (~32 KB) for each partition fetched.

This causes task failures: the codec allocates direct buffers, which are not freed until a full GC, and allocating them does not itself trigger a GC.
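As a rough illustration of this leak pattern, the following self-contained sketch uses a hypothetical `Pool` class (a stand-in, not Hadoop's `CodecPool`) and a hypothetical `fetchAll` helper. It counts how many direct buffers get allocated when a borrowed buffer is, or is not, handed back after each partition. The 32 KB buffer size is taken from the description; everything else is an assumption made for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

public class BufferPoolSketch {
    // Hypothetical stand-in for a codec pool; NOT Hadoop's CodecPool.
    static final class Pool {
        private final Deque<ByteBuffer> idle = new ArrayDeque<>();
        private int created = 0;

        synchronized ByteBuffer borrow() {
            ByteBuffer b = idle.poll();
            if (b == null) {
                // Nothing idle: allocate a fresh direct (off-heap) buffer.
                created++;
                b = ByteBuffer.allocateDirect(32 * 1024);
            }
            b.clear();
            return b;
        }

        synchronized void giveBack(ByteBuffer b) {
            idle.push(b);
        }

        synchronized int created() {
            return created;
        }
    }

    // Simulates fetching `partitions` partitions and returns how many
    // buffers had to be freshly allocated along the way.
    static int fetchAll(int partitions, boolean returnToPool) {
        Pool pool = new Pool();
        for (int i = 0; i < partitions; i++) {
            ByteBuffer buf = pool.borrow();
            try {
                buf.put((byte) i); // stand-in for decompressing one partition
            } finally {
                if (returnToPool) {
                    pool.giveBack(buf); // the step a leaky caller forgets
                }
            }
        }
        return pool.created();
    }

    public static void main(String[] args) {
        System.out.println("without return: " + fetchAll(1000, false));
        System.out.println("with return:    " + fetchAll(1000, true));
    }
}
```

Without the return step, every partition allocates a new buffer; with it, a single buffer is reused. In Hadoop itself the corresponding calls are `CodecPool.getDecompressor` and `CodecPool.returnDecompressor`; whether the attached patch uses exactly that pairing is not stated in this issue.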

It is possible to perform the entire shuffle operation without ever triggering a full GC, as in the following case:

Container exited with a non-zero exit code 143
], AttemptID:attempt_1399351577718_2330_1_05_000020_3 Info:Container container_1399351577718_2330_01_000310 COMPLETED with diagnostics set to [Container [pid=1734,containerID=container_1399351577718_2330_01_000310] is running beyond physical memory limits. Current usage: 4.1 GB of 4 GB physical memory used; 5.4 GB of 40 GB virtual memory used. Killing container.

container_1399351577718_2330_01_000365/ $ grep -ri "CodecPool.*brand-new" syslog* | wc -l

6988

That is a leak of approximately 436 MB on a JVM started with -Xmx3500m in a 4096 MB container.
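The estimate can be checked with a little arithmetic. The sketch below assumes each "brand-new" log line corresponds to one leaked allocation, and the `leakedMiB` helper is a name introduced only for this example. With 6988 allocations, 32 KiB each gives 218.375 MiB, while 64 KiB each gives 436.75 MiB, so the reported figure matches 64 KiB per leaked object. That would fit two buffers of about 32 KB per object, but the issue does not say so; treat it as an assumption.

```java
public class LeakEstimate {
    // Total leaked size in MiB for `count` allocations of `bytesEach` bytes.
    static double leakedMiB(long count, long bytesEach) {
        return count * (double) bytesEach / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long count = 6988; // from the grep | wc -l output in the report

        // Per-allocation sizes are assumptions used for comparison.
        System.out.printf("%.3f MiB at 32 KiB each%n", leakedMiB(count, 32 * 1024));
        System.out.printf("%.3f MiB at 64 KiB each%n", leakedMiB(count, 64 * 1024));
    }
}
```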

Attachments

TEZ-1162.1.patch
03/Jun/14 03:50
1 kB
Gopal Vijayaraghavan

Issue Links

is depended upon by

HIVE-7158 Use Tez auto-parallelism in Hive (Closed)

People

Assignee: Gopal Vijayaraghavan

Reporter: Gopal Vijayaraghavan

Votes: 0

Watchers: 2

Dates

Created: 01/Jun/14 00:08

Updated: 06/Sep/14 01:35

Resolved: 04/Jun/14 04:39