  1. Hadoop YARN
  2. YARN-6420

RM startup failure due to wrong order in nodelabel editlog

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The node label edit log file is written in the wrong order if the addition of the StoreNewClusterNodeLabels event to the dispatcher is delayed and an UpdateNodeToLabelsMappingsEvent is added to the dispatcher first.
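
The inversion described above can be sketched in isolation. The following is a minimal simulation, not Hadoop code: the class `DispatcherOrderSketch`, the list standing in for the edit log, and the latches are all hypothetical. It assumes that each client updates in-memory state before enqueueing its store event, which would explain why the replace call succeeds on the live RM even though the log ends up out of order.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the RM admin threads and the dispatcher queue.
class DispatcherOrderSketch {

    // Returns the order in which the two events reached the simulated edit log.
    static List<String> run() {
        List<String> editLog = new CopyOnWriteArrayList<>();
        CountDownLatch labelKnownInMemory = new CountDownLatch(1);
        CountDownLatch mappingEventQueued = new CountDownLatch(1);

        // Client 1: adds label X, but its store event is enqueued late.
        Thread client1 = new Thread(() -> {
            labelKnownInMemory.countDown();      // in-memory state now knows X
            awaitQuietly(mappingEventQueued);    // simulated delay before enqueue
            editLog.add("StoreNewClusterNodeLabels[X]");
        });

        // Client 2: replaces the label on node1 once X is known in memory.
        Thread client2 = new Thread(() -> {
            awaitQuietly(labelKnownInMemory);
            editLog.add("UpdateNodeToLabelsMappingsEvent[node1->X]");
            mappingEventQueued.countDown();
        });

        client1.start();
        client2.start();
        try {
            client1.join();
            client2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return editLog;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The mapping update is logged before the label it refers to.
        System.out.println(run());
    }
}
```

The latches only make the bad interleaving deterministic for illustration; on a real cluster it would depend on thread scheduling.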

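      Steps to reproduce:

      1. Configure the RM admin client thread count to 2.
      2. Client 1: add node label X to the cluster.
      3. Delay the addition of the resulting StoreNewClusterNodeLabels event to the dispatcher.
      4. Client 2: replace the node label on node1 with X.
      5. Make sure the UpdateNodeToLabelsMappingsEvent is added to the dispatcher first.
      6. Restart the ResourceManager. It fails to transition to Active with the exception below.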

      2017-03-31 16:20:42,236 | WARN  | main-EventThread | Exception handling the winning of election | ActiveStandbyElector.java:836
      org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
              at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:128)
              at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:832)
              at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:422)
              at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:728)
              at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:600)
      Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:331)
              at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:126)
              ... 4 more
      Caused by: org.apache.hadoop.service.ServiceStateException: java.io.IOException: Not all labels being replaced contained by known label collections, please check, new labels=[1]
              at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
              at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204)
              at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:642)
              at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1042)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1083)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1079)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
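
The failure on restart can be illustrated with a small replay sketch. This is a simplified model under stated assumptions, not the actual recovery code: the class `EditLogReplaySketch`, the `ADD <label>` and `REPLACE <node> <label>` entry format, and the use of `IllegalStateException` in place of the `IOException` seen in the trace are all hypothetical. It shows why replaying a replace entry before the entry that adds its label is rejected.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of replaying a node label edit log in file order.
class EditLogReplaySketch {

    // Replays entries in order and returns the known cluster labels.
    static Set<String> replay(List<String> entries) {
        Set<String> knownLabels = new HashSet<>();
        for (String entry : entries) {
            String[] parts = entry.split(" ");
            if (parts[0].equals("ADD")) {
                // "ADD <label>": register a new cluster node label.
                knownLabels.add(parts[1]);
            } else if (parts[0].equals("REPLACE")) {
                // "REPLACE <node> <label>": the label must already be known.
                String label = parts[2];
                if (!knownLabels.contains(label)) {
                    throw new IllegalStateException(
                        "Not all labels being replaced contained by known "
                        + "label collections, new labels=[" + label + "]");
                }
            } else {
                throw new IllegalArgumentException("Unknown entry: " + entry);
            }
        }
        return knownLabels;
    }

    // Convenience wrapper: true if the whole log replays without error.
    static boolean replaySucceeds(List<String> entries) {
        try {
            replay(entries);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Correct order: the label is added before it is assigned to node1.
        System.out.println(replaySucceeds(Arrays.asList("ADD X", "REPLACE node1 X")));
        // Order produced by the race: the replace entry comes first.
        System.out.println(replaySucceeds(Arrays.asList("REPLACE node1 X", "ADD X")));
    }
}
```

In this model the first log replays cleanly and the second is rejected, which corresponds to the RM being unable to transition to Active.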
      
      

      Attachments

        1. YARN-6420.0001.patch
          8 kB
          Bibin Chundatt
        2. YARN-6420.0002.patch
          3 kB
          Bibin Chundatt
        3. YARN-6420.0003.patch
          2 kB
          Bibin Chundatt

          People

            Assignee: Bibin Chundatt (bibinchundatt)
            Reporter: Bibin Chundatt (bibinchundatt)
            Votes: 0
            Watchers: 7
