Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
-
None
Description
When the reader is closed in MergeQueue#adjustPriorityQueue, the byte buffer is still held in several places in the code while unreserve is called. In the case below, the Fetcher was trying to fetch a nearly 100MB map output which exposed this race condition.
Caused by: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56) at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput.<init>(MapOutput.java:75) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MapOutput.createMemoryMapOutput(MapOutput.java:124) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.unconditionalReserve(MergeManager.java:437) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.reserve(MergeManager.java:427) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyMapOutput(FetcherOrderedGrouped.java:481) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:286) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:176) at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:191)