Hadoop Map/Reduce / MAPREDUCE-5251

Reducer should not implicate map attempt if it has insufficient space to fetch map output


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.7, 2.0.4-alpha
    • Fix Version/s: 0.23.10, 2.1.1-beta
    • Component/s: mrv2
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      A job can fail if a reducer happens to run on a node with insufficient space to hold a map attempt's output. The reducer keeps reporting the map attempt as bad, and if the map attempt is re-launched too many times before the reducer is identified as the real problem, the job can fail.

      In that scenario it would be better to re-launch the reduce attempt, in the hope that it runs on another node with sufficient space to complete the shuffle. Reporting the map attempt as bad and re-launching the map task doesn't change the fact that the reducer can't hold the output.
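
      The proposed behavior can be illustrated with a minimal sketch. This is not the attached patch and does not use Hadoop's actual classes: FetchFailureAction, ShuffleFailurePolicy, and decide are hypothetical names, and the sketch assumes the reducer knows both the size of the map output and its own usable local space at the moment a fetch fails.

```java
// Hypothetical sketch only: these types are illustrative and are not part of Hadoop's API.
enum FetchFailureAction {
    REPORT_MAP_ATTEMPT,   // blame the map attempt so that it may be re-launched
    FAIL_REDUCE_ATTEMPT   // fail this reduce attempt so that it can be re-launched, possibly on another node
}

final class ShuffleFailurePolicy {
    private ShuffleFailurePolicy() {}

    /**
     * Decides what to do after a failed fetch of one map output.
     *
     * @param mapOutputBytes   size of the map output the reducer tried to fetch (assumed known)
     * @param usableLocalBytes space the reducer has available for shuffle data (assumed known)
     */
    static FetchFailureAction decide(long mapOutputBytes, long usableLocalBytes) {
        // If the output cannot fit locally, re-running the map cannot help:
        // the problem is on the reducer's side.
        if (mapOutputBytes > usableLocalBytes) {
            return FetchFailureAction.FAIL_REDUCE_ATTEMPT;
        }
        // Otherwise the failure is not explained by local space, so the
        // map attempt is still the more likely culprit.
        return FetchFailureAction.REPORT_MAP_ATTEMPT;
    }
}
```

      With this policy, a 10-byte map output fetched by a reducer that has only 5 usable bytes fails the reduce attempt instead of implicating the map attempt.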

      Attachments

        1. MAPREDUCE-5251-2.txt
          7 kB
          Ashwin Shankar
        2. MAPREDUCE-5251-3.txt
          5 kB
          Ashwin Shankar
        3. MAPREDUCE-5251-4.txt
          5 kB
          Ashwin Shankar
        4. MAPREDUCE-5251-5.txt
          5 kB
          Ashwin Shankar
        5. MAPREDUCE-5251-6.txt
          5 kB
          Ashwin Shankar
        6. MAPREDUCE-5251-7.txt
          5 kB
          Ashwin Shankar
        7. MAPREDUCE-5251-7-b23.txt
          6 kB
          Ashwin Shankar

          People

            Assignee: ashwinshankar77 Ashwin Shankar
            Reporter: jlowe Jason Darrell Lowe
            Votes: 0
            Watchers: 5
