Hadoop Map/Reduce / MAPREDUCE-26

The shuffle keeps the ReduceTask locked while doing a FileSystem.rename, leading to task timeouts

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      The shuffle in ReduceTask.ReduceCopier.MapOutputCopier.copyOutput locks the entire ReduceTask while doing a FileSystem.rename operation. Unfortunately, RawLocalFileSystem implements rename as a copy and delete, which can take a long time. As a result, the reduce task is killed for not reporting progress for 10 minutes.
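      The locking pattern described above can be illustrated with a minimal sketch. This is not the actual ReduceTask code: the class CopierSketch, its lock object, and both commit methods are hypothetical stand-ins, and java.nio.file.Files.move is used in place of FileSystem.rename. The second method shows one possible way to keep a slow file move from stalling other threads, assuming the move itself does not need the lock.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the copier; not the real ReduceTask code. */
public class CopierSketch {
    // Plays the role of the ReduceTask-wide lock from the description.
    private final Object lock = new Object();
    private final List<Path> finished = new ArrayList<>();

    /** Problematic shape: the (possibly slow) file move runs while the lock is held. */
    public void commitHoldingLock(Path tmp, Path dst) throws IOException {
        synchronized (lock) {
            Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
            finished.add(dst);
        }
    }

    /** Narrower shape: move the file first, then take the lock only to update shared state. */
    public void commitOutsideLock(Path tmp, Path dst) throws IOException {
        Files.move(tmp, dst, StandardCopyOption.REPLACE_EXISTING);
        synchronized (lock) {
            finished.add(dst);
        }
    }

    /** Number of outputs recorded as committed. */
    public int finishedCount() {
        synchronized (lock) {
            return finished.size();
        }
    }
}
```

      In the first method, any thread that needs the lock (for example, one that reports progress) waits for the whole move; in the second, it waits only for the list update.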

        Activity

        Transition: Open → Resolved
        Time In Source Status: 1589d 14h 39m
        Execution Times: 1
        Last Executer: Harsh J
        Last Execution Date: 31/Dec/11 09:10
        Harsh J made changes -
        Status: Open [ 1 ] → Resolved [ 5 ]
        Resolution: (none) → Not A Problem [ 8 ]
        Harsh J added a comment -

        We attempt a proper rename now, and only fall back to copy+delete otherwise.
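        The "rename first, copy+delete only as a fallback" behavior mentioned in this comment can be sketched as follows. This is an illustration under stated assumptions, not the RawLocalFileSystem source: the class RenameSketch and the method renameWithFallback are hypothetical names, and the sketch uses java.io.File.renameTo and java.nio.file.Files on the local file system, for regular files only.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

/** Hypothetical illustration of rename with a copy+delete fallback. */
public class RenameSketch {
    /**
     * Tries a native rename first, which is typically cheap when source and
     * destination are on the same volume. Falls back to copy+delete otherwise.
     * Handles regular files only; directories are not copied recursively.
     *
     * @return true if the native rename succeeded, false if the fallback ran
     */
    public static boolean renameWithFallback(File src, File dst) throws IOException {
        if (src.renameTo(dst)) {
            return true;
        }
        // Fallback path: cost grows with file size, which is the slow case
        // the issue description complains about.
        Files.copy(src.toPath(), dst.toPath(), StandardCopyOption.REPLACE_EXISTING);
        Files.delete(src.toPath());
        return false;
    }
}
```

        Whether the first branch succeeds depends on the platform and on where the two paths live, so callers should not rely on the return value being true.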

        Owen O'Malley made changes -
        Field: Original Value → New Value
        Project: Hadoop Common [ 12310240 ] → Hadoop Map/Reduce [ 12310941 ]
        Key: HADOOP-1778 → MAPREDUCE-26
        Affects Version/s: 0.14.0 [ 12312474 ] → (none)
        Component/s: mapred [ 12310690 ] → (none)
        Doug Cutting added a comment -

        I wonder if it's time to fix RawLocalFileSystem in this regard...

        Owen O'Malley created issue -

          People

          • Assignee: Owen O'Malley
          • Reporter: Owen O'Malley
          • Votes: 0
          • Watchers: 0
