  1. Hadoop YARN
  2. YARN-7368

YARN Work-Preserving Restart: Better Handling of a Failed Disk

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.8.1, 3.0.0
    • Fix Version/s: None
    • Component/s: nodemanager, yarn
    • Labels: None

      Description

      If the drive that hosts yarn.nodemanager.recovery.dir has failed, the NodeManager (NM) will not start at all. Please improve this so that, if the directory cannot be created or accessed, the recovery portion of the NM is skipped and the NM continues to operate as normal.
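
      The requested fallback could look roughly like the sketch below. It is not NodeManager code: the class RecoveryDirCheck, the method isUsable, and the default path are hypothetical names chosen for illustration, and the only assumption is that "usable" means the directory exists or can be created and is readable and writable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop: illustrates the requested fallback only.
public class RecoveryDirCheck {

    /**
     * Returns true if the directory exists (or can be created) and is
     * readable and writable. Returns false instead of throwing, so the
     * caller can choose to start without recovery.
     */
    public static boolean isUsable(Path dir) {
        try {
            // Succeeds if the directory already exists; fails on a broken
            // or read-only disk, or if a non-directory is in the way.
            Files.createDirectories(dir);
        } catch (IOException | SecurityException e) {
            return false;
        }
        return Files.isDirectory(dir) && Files.isReadable(dir) && Files.isWritable(dir);
    }

    public static void main(String[] args) {
        // Assumed path for illustration; the real value would come from
        // yarn.nodemanager.recovery.dir.
        Path recoveryDir = Paths.get(args.length > 0 ? args[0] : "/tmp/yarn-nm-recovery");
        if (isUsable(recoveryDir)) {
            System.out.println("Recovery enabled, state stored in " + recoveryDir);
        } else {
            System.out.println("Recovery directory unusable, starting without recovery: " + recoveryDir);
        }
    }
}
```

      The point of the design is that a failed directory check becomes a boolean the caller can act on, rather than an exception that aborts startup.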

      It may also be beneficial to allow multiple recovery directories to be defined, as is already possible for the YARN logging directories, so that if one drive fails, not all of the recovery data is lost.
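
      As a rough illustration of the multi-directory idea, the sketch below filters a comma-separated list of candidate directories down to those that are currently usable. A multi-valued recovery directory setting does not exist today; the class RecoveryDirSelector, the method usableDirs, and the example paths are all hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: sketches how a list of
// recovery directories could be narrowed to the ones on healthy disks.
public class RecoveryDirSelector {

    /** Returns the listed directories that exist and are writable, in order. */
    public static List<Path> usableDirs(String commaSeparated) {
        List<Path> usable = new ArrayList<>();
        for (String entry : commaSeparated.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            Path dir = Paths.get(trimmed);
            if (Files.isDirectory(dir) && Files.isWritable(dir)) {
                usable.add(dir);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        // Assumed example value; a real setting would come from yarn-site.xml.
        String configured = args.length > 0 ? args[0] : "/data1/yarn/recovery,/data2/yarn/recovery";
        List<Path> dirs = usableDirs(configured);
        if (dirs.isEmpty()) {
            System.out.println("No usable recovery directory, starting without recovery");
        } else {
            System.out.println("Usable recovery directories: " + dirs);
        }
    }
}
```

      How recovery state would be replicated or split across the surviving directories is left open here, as the description does not specify it.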

      https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html

              People

              • Assignee:
                Unassigned
                Reporter:
                belugabehr David Mollitor
              • Votes:
                0
                Watchers:
                1
