Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Description
The shuffle in ReduceTask.ReduceCopier.MapOutputCopier.copyOutput holds the lock on the entire ReduceTask while performing a FileSystem.rename operation. Unfortunately, RawLocalFileSystem implements rename as a copy followed by a delete, which can take a long time. As a result, the reduce task is killed for failing to report progress for 10 minutes.
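
To illustrate the locking pattern described above, here is a minimal sketch. It is not the Hadoop code: the class RenameOutsideLockSketch and its methods are hypothetical, java.nio.file.Files.move stands in for FileSystem.rename, and a plain monitor object stands in for the ReduceTask lock. It only contrasts doing the slow file operation while holding the lock with doing it before taking the lock.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hadoop.
public class RenameOutsideLockSketch {
    // Stand-in for the shared ReduceTask state and the lock that guards it.
    private final Object lock = new Object();
    private final List<Path> committedOutputs = new ArrayList<>();

    // Pattern from the report: the potentially slow move runs while the lock
    // is held, so every other thread needing the lock is blocked meanwhile.
    public void commitHoldingLock(Path tmp, Path dst) throws IOException {
        synchronized (lock) {
            Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
            committedOutputs.add(dst);
        }
    }

    // Alternative: do the slow move first, then take the lock only for the
    // short in-memory bookkeeping update.
    public void commitOutsideLock(Path tmp, Path dst) throws IOException {
        Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
        synchronized (lock) {
            committedOutputs.add(dst);
        }
    }

    public int committedCount() {
        synchronized (lock) {
            return committedOutputs.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("shuffle-sketch");
        Path tmp = Files.write(dir.resolve("map_0.out.tmp"), new byte[] {1, 2, 3});
        Path dst = dir.resolve("map_0.out");

        RenameOutsideLockSketch sketch = new RenameOutsideLockSketch();
        sketch.commitOutsideLock(tmp, dst);
        System.out.println(Files.exists(dst) + " " + sketch.committedCount());
    }
}
```

Whether moving the rename out of the lock is safe in the real copier depends on what else the lock protects, which this sketch does not model.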