Hadoop Common / HADOOP-506

job tracker hangs on to dead task trackers "forever"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None

      Description

      I see cases where a task tracker gets disconnected from the job tracker and reconnects, and then appears twice in the job tracker's list, with one instance being alive and well, and the other's 'time since last heartbeat' increasing monotonically.
      That all makes sense.
      What doesn't make sense is that the old instances never expire. It's been over 400000 seconds (more than four days) since the last heartbeat, and the cluster reports having more nodes up and running than its size (350 nodes in a 320-node cluster).

      There should be some reasonable timeout for these expired task trackers, somewhere between 10 minutes and an hour.
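
To make the requested behaviour concrete, here is a minimal sketch of heartbeat-based expiry. It is not the attached patch and not Hadoop's actual JobTracker code: the class `TrackerRegistry`, its methods, and the 10-minute value used in the example are all hypothetical, chosen only to illustrate the low end of the timeout range suggested above.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical registry of task trackers; illustrative only, not Hadoop code. */
class TrackerRegistry {
    // Maximum allowed silence before a tracker is dropped (assumed parameter).
    private final long expiryMillis;
    // Tracker id -> timestamp (ms) of its most recent heartbeat.
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    TrackerRegistry(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Record a heartbeat; a reconnecting tracker with a new id adds a new entry. */
    void heartbeat(String trackerId, long nowMillis) {
        lastHeartbeat.put(trackerId, nowMillis);
    }

    /** Remove every tracker whose last heartbeat is older than the expiry interval. */
    int expire(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > expiryMillis) {
                it.remove(); // safe removal while iterating
                removed++;
            }
        }
        return removed;
    }

    /** Number of trackers currently considered alive. */
    int liveCount() {
        return lastHeartbeat.size();
    }
}
```

With periodic calls to `expire`, the stale entry left behind by a reconnecting tracker would disappear once it has been silent for longer than the interval, so the reported node count could no longer exceed the cluster size indefinitely.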

        Attachments

        1. Hadoop-506.patch
          0.7 kB
          Sanjay Dahiya

            People

            • Assignee: Sanjay Dahiya (sanjay.dahiya)
            • Reporter: Yoram Arnon (yarnon)