Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: nodemanager
    • Labels:
      None

      Description

      This is the MR counterpart to HDFS-1848. Like HDFS volume failure detection, NM disk failure detection checks only a subset of the disks and a subset of the directories. E.g., the TT and the NM do not check the root disk for errors unless a local dir resides on it. Even if a local dir resides on the root disk, the disk checking code only checks the local dirs, so a failure seen only when accessing a part of the disk not hosting the local dirs will not be noticed. The disk that hosts the logs, pid, tmp dirs, etc. is critical, so it needs to be checked as well, and the NM should shut down if a critical disk is not available (to prevent MR issues similar to HDFS-1848 and HDFS-2095). People currently work around this limitation (aside from ignoring it) by using RAID-1 for the root disk or a health script that checks the root disk's health.
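
To make the proposal concrete, here is a minimal sketch of what probing the critical directories (logs, pid, tmp) alongside the local dirs might look like. It is not the NodeManager's actual disk checking code: the class `CriticalDirChecker`, its methods, and the exit-on-failure behaviour are hypothetical names and choices made only for illustration, and the sketch uses plain `java.nio.file` calls rather than any Hadoop API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: probes directories that the
// description calls critical (logs, pid, tmp) in addition to the local dirs.
class CriticalDirChecker {

    // Returns true if dir is an accessible directory and a small probe file
    // can be created and written inside it.
    static boolean isUsable(Path dir) {
        if (!Files.isDirectory(dir) || !Files.isReadable(dir)
                || !Files.isWritable(dir) || !Files.isExecutable(dir)) {
            return false;
        }
        Path probe = null;
        try {
            probe = Files.createTempFile(dir, "disk-check", ".tmp");
            Files.write(probe, new byte[] {1});
            return true;
        } catch (IOException | SecurityException e) {
            return false;
        } finally {
            if (probe != null) {
                try {
                    Files.deleteIfExists(probe); // best-effort cleanup
                } catch (IOException ignored) {
                    // a failed cleanup does not change the probe result
                }
            }
        }
    }

    // Returns the directories that failed the probe, in input order.
    static List<Path> findFailedDirs(List<Path> dirs) {
        List<Path> failed = new ArrayList<>();
        for (Path dir : dirs) {
            if (!isUsable(dir)) {
                failed.add(dir);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Directories to treat as critical; defaults to the JVM tmp dir.
        List<Path> critical = new ArrayList<>();
        for (String arg : args) {
            critical.add(Paths.get(arg));
        }
        if (critical.isEmpty()) {
            critical.add(Paths.get(System.getProperty("java.io.tmpdir")));
        }
        List<Path> failed = findFailedDirs(critical);
        if (!failed.isEmpty()) {
            System.err.println("Critical directories unavailable: " + failed);
            System.exit(1); // stand-in for the NM shutting down
        }
        System.out.println("All critical directories usable");
    }
}
```

A write probe like this only catches failures that surface on the probed path. It illustrates the "check the log, pid, and tmp dirs too" idea and is no substitute for a health script or RAID-1 on the root disk.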

        Issue Links

          Activity

          Eli Collins created issue -
          Eli Collins made changes -
          Field Original Value New Value
          Link This issue relates to HDFS-2095 [ HDFS-2095 ]
          Eli Collins made changes -
          Link This issue relates to MAPREDUCE-3121 [ MAPREDUCE-3121 ]
          Vinod Kumar Vavilapalli made changes -
          Parent MAPREDUCE-3121 [ 12525190 ]
          Issue Type Improvement [ 4 ] Sub-task [ 7 ]
          Vinod Kumar Vavilapalli made changes -
          Link This issue relates to MAPREDUCE-3121 [ MAPREDUCE-3121 ]
          Vinod Kumar Vavilapalli made changes -
          Parent MAPREDUCE-3121 [ 12525190 ]
          Issue Type Sub-task [ 7 ] Bug [ 1 ]
          Vinod Kumar Vavilapalli made changes -
          Project Hadoop Map/Reduce [ 12310941 ] Hadoop YARN [ 12313722 ]
          Key MAPREDUCE-3474 YARN-92
          Affects Version/s 0.23.0 [ 12315570 ]
          Affects Version/s 0.20.205.0 [ 12316391 ]
          Component/s nodemanager [ 12319323 ]
          Component/s tasktracker [ 12312906 ]
          Component/s nodemanager [ 12315341 ]
          Vinod Kumar Vavilapalli made changes -
          Parent YARN-91 [ 12606698 ]
          Issue Type Bug [ 1 ] Sub-task [ 7 ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Eli Collins
            • Votes:
              0
              Watchers:
              8

              Dates

              • Created:
                Updated: