I found that, even when the output of a map task fits entirely in its sort buffer, the task rewrites the whole output instead of just renaming the first spill into place. This happens because RawLocalFileSystem.rename() falls back to a copy if renameTo() fails. The first rename attempt fails because nothing has called mkdir for the output directory yet.
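To illustrate the failure mode, here is a minimal sketch using plain java.io.File rather than Hadoop's own classes. The class name RenameSketch, the helper renameIntoPlace, and the file names are hypothetical and chosen only for illustration; this is not the actual patch. It assumes a platform where File.renameTo() returns false when the destination's parent directory does not exist, which is the behavior described above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class RenameSketch {
    // Hypothetical helper: make sure the destination's parent directory
    // exists before renaming, so the rename does not fail for that reason.
    static boolean renameIntoPlace(File src, File dst) {
        File parent = dst.getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            return false; // could not create the output directory
        }
        return src.renameTo(dst);
    }

    public static void main(String[] args) throws IOException {
        File tmp = Files.createTempDirectory("rename-sketch").toFile();

        // Stand-in for the first (and only) spill file; the name is illustrative.
        File spill = new File(tmp, "spill0.out");
        Files.write(spill.toPath(), "map output".getBytes());

        // Destination inside a directory that has not been created yet.
        File dst = new File(new File(tmp, "output"), "file.out");

        // A plain rename is attempted first, with the parent directory missing.
        System.out.println("plain renameTo: " + spill.renameTo(dst));

        // Creating the parent first lets the rename itself do the work.
        System.out.println("mkdirs + renameTo: " + renameIntoPlace(spill, dst));
    }
}
```

A caller that silently falls back to copying when the plain rename fails would hide the missing directory and pay for a full rewrite of the data, which matches the symptom reported here.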
|Status||Resolved [ 5 ]||Closed [ 6 ]|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Hadoop Flags||Reviewed [ 10343 ]|
|Target Version/s||0.23.0 [ 12315570 ]|
|Fix Version/s||0.23.0 [ 12315570 ]|
|Resolution||Fixed [ 1 ]|
|Status||Open [ 1 ]||Patch Available [ 10002 ]|