Description
createKVIterator in ReduceTask contains the following code:
try { Merger.writeFile(rIter, writer, reporter, job); addToMapOutputFilesOnDisk(fs.getFileStatus(outputPath)); } catch (Exception e) { if (null != outputPath) { fs.delete(outputPath, true); } throw new IOException("Final merge failed", e); } finally { if (null != writer) { writer.close(); } }
Merger#writeFile() does not close the file after writing it, so when fs.getFileStatus() is called on it, it may not return the correct length. This causes bad accounting further down the line, which can lead to map output data being lost.