Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.20.204.0
    • Fix Version/s: None
    • Component/s: tasktracker
    • Labels:
      None

      Description

      Like HDFS-1161 but for the TT. The user should be able to configure how many valid disks are required for the TT to operate. Currently the TT will start and accept tasks even if, e.g., only 1 of its 12 disks is working, which leads to poor performance for jobs whose tasks run on this machine.

        Activity

        Eli Collins added a comment -

        Thought about this some. I think leaving the current behavior as is (the TT keeps running regardless of the number of disk failures) but using a health script that shuts down the TT when the DN goes down makes more sense. The DN already has logic for shutting down after a sufficient number of disk failures, and it doesn't make sense for the TT to keep running if the DN isn't running. I do think we still need to fix MAPREDUCE-2657; otherwise restarting a cluster may leave a bunch of previously running TTs unable to come up, because they tolerated a disk failure while running but won't tolerate it at startup.
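
A minimal sketch of the health-script idea above, written in Python for illustration. It assumes the DN listens on port 50010 (the old default data transfer port; a given cluster may differ) and it relies on the node health check convention, as I understand it, that a line of output starting with ERROR marks the node unhealthy. The function names are hypothetical, and marking a node unhealthy is not the same as stopping the TT process, which would need an extra step.

```python
import socket


def datanode_is_up(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the DN's port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat the DN as down.
        return False


def health_report(dn_up: bool) -> str:
    """Build the single line the health script would print."""
    # Assumption: output beginning with "ERROR" flags the node as unhealthy.
    return "OK" if dn_up else "ERROR DataNode is not running"


if __name__ == "__main__":
    # Host and port are assumptions; adjust to the cluster's DN address.
    print(health_report(datanode_is_up("localhost", 50010)))
```

A port probe only shows that something is listening; checking the DN process or its web UI would be an equally reasonable choice.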

        Joep Rottinghuis added a comment -

        Would it be possible to keep the TT up but scale back the number of slots available?
        For example, if 3 out of 10 disks are bad, only 7/10 of the slots would be available?
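
The proportional scaling suggested here could be sketched as follows. This is only an illustration of the arithmetic, not TT code; `scaled_slots` and its parameters are hypothetical names, and rounding down is an assumed policy.

```python
def scaled_slots(configured_slots: int, total_disks: int, failed_disks: int) -> int:
    """Scale the slot count by the fraction of disks that are still healthy."""
    if total_disks <= 0:
        raise ValueError("total_disks must be positive")
    if not 0 <= failed_disks <= total_disks:
        raise ValueError("failed_disks must be between 0 and total_disks")
    healthy_disks = total_disks - failed_disks
    # Integer arithmetic, rounded down (assumed policy).
    return configured_slots * healthy_disks // total_disks


# 3 of 10 disks bad, 10 configured slots: 7 slots remain.
print(scaled_slots(10, 10, 3))
```

With rounding down, a node with 8 slots and only 1 of 12 disks healthy ends up with 0 slots, which effectively takes it out of service; whether to keep a floor of one slot is a separate policy decision.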

        Arun C Murthy added a comment -

        We should strive to make this portable across clusters; thus a percentage of disks, not a number of disks, would be better.
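
To illustrate why a percentage travels better between clusters than a count, here is a small sketch. The function name and the 75% figure are made up for the example and do not correspond to any existing configuration key.

```python
def enough_disks(healthy: int, total: int, min_healthy_pct: int) -> bool:
    """Return True if at least min_healthy_pct percent of disks are healthy."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Compare in integers to avoid floating-point rounding at the boundary.
    return healthy * 100 >= min_healthy_pct * total


# The same 75% setting adapts to nodes with different disk counts:
print(enough_disks(3, 4, 75))    # 4-disk node with 1 failure
print(enough_disks(9, 12, 75))   # 12-disk node with 3 failures
print(enough_disks(1, 12, 75))   # the 1-of-12 case from the description
```

By contrast, a fixed rule such as "tolerate 2 failed disks" would leave a 4-disk node running on half its disks while a 12-disk node would still have 10 of 12.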


          People

          • Assignee:
            Unassigned
            Reporter:
            Eli Collins
          • Votes:
            0
            Watchers:
            4

            Dates

            • Created:
              Updated:
              Resolved:
