Hadoop Map/Reduce / MAPREDUCE-935

There's little to be gained by putting a host into the penalty box at reduce time if it's the only host you have


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: tasktracker
    • Labels: None

    Description

      Exponential backoff may be good for dealing with troublesome hosts, but not if you only have one host in the entire system. From the log of TestNodeRefresh, which for some reason is blocking in the reduce phase, I can see that it doesn't take much for the backoff to escalate so rapidly that the reducer ends up waiting longer than the test does:

      2009-08-28 21:39:16,788 WARN  mapred.ReduceTask (ReduceTask.java:fetchOutputs(2192)) - attempt_20090828213826033_0001_r_000000_0 adding host localhost to penalty box, next contact in 150 seconds
      

      The result of this backoff is that the reduce process appears to hang, and it gets killed from above.

      Note that this isn't the root cause of the problem, but it certainly amplifies things.
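      A minimal sketch of the kind of guard this implies is shown below. The class, method, and constant names are illustrative assumptions, not the actual ReduceTask shuffle code; the point is only that the back-off decision should look at how many hosts are still serving map output and stop escalating when there is nowhere else to fetch from.

      import java.util.Map;
      import java.util.Set;
      import java.util.concurrent.ConcurrentHashMap;

      /**
       * Sketch of a penalty box that backs off exponentially per host, but
       * refuses to escalate when the host is the only source of map output
       * left. Names here are illustrative, not ReduceTask internals.
       */
      public class PenaltyBox {
          private static final long BASE_PENALTY_MS = 4_000;   // first back-off
          private static final long MAX_PENALTY_MS  = 60_000;  // hard cap

          // host -> number of consecutive fetch failures
          private final Map<String, Integer> failures = new ConcurrentHashMap<>();

          /**
           * Decide how long to wait before re-contacting a host that just
           * failed a map-output fetch.
           *
           * @param host      the host that failed
           * @param liveHosts all hosts still holding map output we need
           * @return delay in milliseconds before the next fetch attempt
           */
          public long penalize(String host, Set<String> liveHosts) {
              int strikes = failures.merge(host, 1, Integer::sum);

              if (liveHosts.size() <= 1) {
                  // Only one host in the whole system: exponential back-off
                  // just stalls the reducer until something above kills it,
                  // so retry at the base interval instead of escalating.
                  return BASE_PENALTY_MS;
              }

              // Normal case: exponential back-off, capped so a flaky host is
              // never ignored for longer than MAX_PENALTY_MS.
              long penalty = BASE_PENALTY_MS << Math.min(strikes - 1, 10);
              return Math.min(penalty, MAX_PENALTY_MS);
          }

          /** Clear the strike count once a fetch from the host succeeds. */
          public void onSuccess(String host) {
              failures.remove(host);
          }

          // Tiny demo: the single remaining host never escalates, while a
          // host in a larger set backs off exponentially.
          public static void main(String[] args) {
              PenaltyBox box = new PenaltyBox();
              Set<String> onlyHost  = Set.of("localhost");
              Set<String> manyHosts = Set.of("host-a", "host-b", "host-c");
              for (int i = 0; i < 4; i++) {
                  System.out.printf("single-host delay: %d ms, multi-host delay: %d ms%n",
                      box.penalize("localhost", onlyHost),
                      box.penalize("host-a", manyHosts));
              }
          }
      }

      With a guard like this, a single-host setup such as the TestNodeRefresh run keeps retrying at the base interval instead of sleeping for minutes, while multi-host clusters still get the usual exponential penalty.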


          People

            Assignee: Unassigned
            Reporter: Steve Loughran (stevel@apache.org)
            Votes: 0
            Watchers: 0
