  1. Hadoop YARN
  2. YARN-4881

RM continuously switches if HDFS is too busy when NodeLabel is configured


    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: resourcemanager
    • Labels:
      None

      Description

      It was observed in a production cluster that the RM fails to become active and keeps switching continuously when HDFS is too busy and node labels are configured. This causes very high RM downtime.

      Exception from RM logs

      Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/mapred/node-labels/nodelabel.mirror.writing could only be replicated to 0 nodes instead of minReplication (=1). There are 7 datanode(s) running and no node(s) are excluded in this operation.
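
One possible reading of the behaviour above, sketched as a toy model rather than actual ResourceManager code: if becoming active requires a node-label store write to HDFS to succeed, then every failed write sends the RM back to standby and the transition is attempted again. All names here (`FailoverLoopSketch`, `LabelStore`, `busyStore`, `attemptsUntilActive`) are hypothetical and are not YARN APIs; the retry loop is an assumption about the mechanism, not something the report states.

```java
import java.io.IOException;

/** Toy model of the reported loop; none of these names are YARN APIs. */
public class FailoverLoopSketch {

    /** Hypothetical stand-in for a label store whose startup write goes to HDFS. */
    interface LabelStore {
        void init() throws IOException;
    }

    /** Simulates an HDFS that rejects the first busyAttempts writes. */
    static LabelStore busyStore(int busyAttempts) {
        int[] calls = {0};
        return () -> {
            if (calls[0]++ < busyAttempts) {
                throw new IOException("could only be replicated to 0 nodes");
            }
        };
    }

    /**
     * Returns how many transition attempts were needed to become active,
     * or -1 if the RM was still standby after maxAttempts.
     */
    static int attemptsUntilActive(int busyAttempts, int maxAttempts) {
        LabelStore store = busyStore(busyAttempts);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                store.init();    // assumed: the store write must succeed to go active
                return attempt;  // transition to active completed
            } catch (IOException e) {
                // assumed: the failed transition drops the RM back to standby,
                // and the next loop iteration models the next attempt
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(attemptsUntilActive(0, 5));   // prints 1
        System.out.println(attemptsUntilActive(3, 5));   // prints 4
        System.out.println(attemptsUntilActive(10, 5));  // prints -1
    }
}
```

In this model, the RM stays out of the active state for as long as HDFS keeps rejecting the write, which matches the downtime described in the report.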
      


            People

            • Assignee:
              Unassigned
            • Reporter:
              rohithsharma (Rohith Sharma K S)
