[HADOOP-2095] Reducer failed due to Out ofMemory - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.15.0
Fix Version/s: 0.18.0
Component/s: None
Labels:
None

Release Note:
Reduced in-memory copies of keys and values as they flow through the Map-Reduce framework. Changed the storage of intermediate map outputs to use new IFile instead of SequenceFile for better compression.

Description

One of the reducers of my job failed with the following exceptions.
The failure caused the whole job fail eventually.
Java heapsize was 768MB and sort.io.mb was 140.

2007-10-23 19:24:06,100 WARN org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Intermediate Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.io.compress.DecompressorStream.(DecompressorStream.java:43)
at org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1345)
at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1231)
at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1154)
at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2726)
at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2543)
at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2297)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:1311)
2007-10-23 19:24:06,102 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 done copying task_200710231912_0001_m_001428_0 output .
2007-10-23 19:24:06,185 INFO org.apache.hadoop.fs.FileSystem: Initialized InMemoryFileSystem: ramfs://mapoutput31952838/task_200710231912_0001_r_000020_2/map_1423.out-0 of size (in bytes): 209715200
2007-10-23 19:24:06,193 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

2007-10-23 19:24:06,193 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001215_0 output from xxx
2007-10-23 19:24:06,188 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001211_0 output from xxx
2007-10-23 19:24:06,185 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryOutputStream.close(InMemoryFileSystem.java:161)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:312)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
at org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:253)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:713)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

2007-10-23 19:24:06,199 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001247_0 output from .
2007-10-23 19:24:06,200 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

2007-10-23 19:24:06,204 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001422_0 output from .
2007-10-23 19:24:06,207 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

2007-10-23 19:24:06,209 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001278_0 output from .
2007-10-23 19:24:06,198 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
java.io.IOException: task_200710231912_0001_r_000020_2The reduce copier failed
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
2007-10-23 19:24:06,198 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

2007-10-23 19:24:06,231 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001531_0 output from .
2007-10-23 19:24:06,197 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

2007-10-23 19:24:06,237 INFO org.apache.hadoop.mapred.ReduceTask: task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001227_0 output from .
2007-10-23 19:24:06,196 ERROR org.apache.hadoop.mapred.ReduceTask: Map output copy failure: java.lang.NullPointerException
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
at org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

HADOOP-2095_2_20080604.patch
05/Jun/08 02:49
162 kB
Arun Murthy
HADOOP-2095_2_20080604.patch
04/Jun/08 23:56
162 kB
Arun Murthy
HADOOP-2095_2_20080604.patch
04/Jun/08 21:34
161 kB
Arun Murthy
HADOOP-2095_2_20080604.patch
04/Jun/08 20:54
161 kB
Arun Murthy
HADOOP-2095_2_20080604.patch
04/Jun/08 09:55
160 kB
Arun Murthy
HADOOP-2095_CompressedBytesWithCodecPool.patch
31/Oct/07 18:50
4 kB
Arun Murthy
HADOOP-2095_debug.patch
31/Oct/07 18:34
2 kB
Arun Murthy

Issue Links

is blocked by

MAPREDUCE-99 Reducers throw oom exceptions during fetching map outputs

Resolved

is depended upon by

HADOOP-3366 Shuffle/Merge improvements

Closed

Reducer failed due to Out ofMemory

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates