  1. Hadoop YARN
  2. YARN-11608

QueueCapacityVectorInfo NPE when accessible labels config is used

Details

    • Reviewed

    Description

      YARN-11514 extended the REST API to contain CapacityVectors for each configured node label. There is an edge case, however: during initialization, each queue's capacities map is filled with 0 capacities for labels that are unconfigured but accessible (there is no configured capacity for the label, but the queue has access to it through the accessible-node-labels property). A very basic example configuration for this is the following:

      "yarn.scheduler.capacity.root.queues": "a, b"
      "yarn.scheduler.capacity.root.a.capacity": "50"
      "yarn.scheduler.capacity.root.a.accessible-node-labels": "root-a-default-label"
      "yarn.scheduler.capacity.root.a.maximum-capacity": "50"
      "yarn.scheduler.capacity.root.b.capacity": "50"
      

      root.a has access to root-a-default-label, but no capacity is configured for it. The capacityVectors are parsed based on the configuredCapacity map (created from the "accessible-node-labels.<label>.capacity" configs). When the scheduler info is requested, the capacityVectors are collected per label, and the labels used for this are the keySet of the capacities map:

      for (String partitionName : capacities.getExistingNodeLabels()) {
        QueueCapacityVector queueCapacityVector =
            queue.getConfiguredCapacityVector(partitionName);
        queueCapacityVectorInfo = queueCapacityVector == null ?
            new QueueCapacityVectorInfo(new QueueCapacityVector()) :
            new QueueCapacityVectorInfo(queue.getConfiguredCapacityVector(partitionName));

      public Set<String> getExistingNodeLabels() {
        readLock.lock();
        try {
          return new HashSet<String>(capacitiesMap.keySet());
        } finally {
          readLock.unlock();
        }
      }
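
      To make the mismatch concrete, here is a minimal, self-contained sketch that models the example configuration as a plain map and computes which labels are accessible to a queue but have no "accessible-node-labels.<label>.capacity" entry. It does not use any Hadoop classes: AccessibleLabelSketch and its helper methods are hypothetical names introduced only for illustration, and the sketch ignores details such as the "*" wildcard and the default partition.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative model only: none of these helpers exist in Hadoop.
final class AccessibleLabelSketch {
    private static final String PREFIX = "yarn.scheduler.capacity.";
    private static final String ACCESSIBLE = ".accessible-node-labels";
    private static final String CAPACITY_SUFFIX = ".capacity";

    /** Labels listed in the queue's accessible-node-labels property. */
    static Set<String> accessibleLabels(Map<String, String> conf, String queuePath) {
        Set<String> labels = new TreeSet<>();
        String raw = conf.get(PREFIX + queuePath + ACCESSIBLE);
        if (raw != null) {
            for (String label : raw.split(",")) {
                String trimmed = label.trim();
                if (!trimmed.isEmpty()) {
                    labels.add(trimmed);
                }
            }
        }
        return labels;
    }

    /** Labels that have an explicit accessible-node-labels.<label>.capacity entry. */
    static Set<String> configuredLabels(Map<String, String> conf, String queuePath) {
        String keyPrefix = PREFIX + queuePath + ACCESSIBLE + ".";
        Set<String> labels = new TreeSet<>();
        for (String key : conf.keySet()) {
            boolean matches = key.startsWith(keyPrefix)
                && key.endsWith(CAPACITY_SUFFIX)
                && key.length() > keyPrefix.length() + CAPACITY_SUFFIX.length();
            if (matches) {
                labels.add(key.substring(keyPrefix.length(),
                    key.length() - CAPACITY_SUFFIX.length()));
            }
        }
        return labels;
    }

    /** Accessible labels without a configured capacity: the problematic case. */
    static Set<String> unconfiguredAccessibleLabels(Map<String, String> conf, String queuePath) {
        Set<String> result = accessibleLabels(conf, queuePath);
        result.removeAll(configuredLabels(conf, queuePath));
        return result;
    }

    /** The example configuration from this issue, as a plain map. */
    static Map<String, String> exampleConf() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("yarn.scheduler.capacity.root.queues", "a, b");
        conf.put("yarn.scheduler.capacity.root.a.capacity", "50");
        conf.put("yarn.scheduler.capacity.root.a.accessible-node-labels", "root-a-default-label");
        conf.put("yarn.scheduler.capacity.root.a.maximum-capacity", "50");
        conf.put("yarn.scheduler.capacity.root.b.capacity", "50");
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(unconfiguredAccessibleLabels(exampleConf(), "root.a"));
    }
}
```

      For root.a the sketch reports root-a-default-label as accessible but unconfigured, which is the kind of label that ends up in the capacities map without a matching configured capacity vector.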
      

      If the capacitiesMap contains entries that are not "configured", this will result in an NPE, breaking the UI and the REST API:

      INTERNAL_SERVER_ERROR
      java.lang.NullPointerException
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.QueueCapacityVectorInfo.<init>(QueueCapacityVectorInfo.java:39)
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.QueueCapacitiesInfo.<init>(QueueCapacitiesInfo.java:61)
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerLeafQueueInfo.populateQueueCapacities(CapacitySchedulerLeafQueueInfo.java:108)
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerQueueInfo.<init>(CapacitySchedulerQueueInfo.java:137)
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerLeafQueueInfo.<init>(CapacitySchedulerLeafQueueInfo.java:66)
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo.getQueues(CapacitySchedulerInfo.java:197)
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.CapacitySchedulerInfo.<init>(CapacitySchedulerInfo.java:94)
      	at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices.getSchedulerInfo(RMWebServices.java:399)
      

      There is no need to create capacityVectors for the unconfigured labels, so a null check should solve this issue on the API side.
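
      The proposed null check could look roughly like the following sketch. It is not the actual patch: VectorInfo is a hypothetical stand-in for QueueCapacityVectorInfo that merely mimics the failure mode seen in the stack trace (its constructor throws on a null vector), and plain maps stand in for the queue's capacities and configured capacity vectors.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative model only: these types are stand-ins, not Hadoop classes.
final class CapacityVectorNullCheckSketch {

    /** Stand-in DAO whose constructor fails when it is handed a null vector. */
    static final class VectorInfo {
        final Map<String, Double> resources;

        VectorInfo(Map<String, Double> vector) {
            // TreeMap's copy constructor throws NullPointerException for null input.
            this.resources = new TreeMap<>(vector);
        }
    }

    /**
     * Builds one VectorInfo per label, skipping labels that exist in the
     * capacities map but have no configured capacity vector.
     */
    static Map<String, VectorInfo> collect(
            Set<String> existingLabels,
            Map<String, Map<String, Double>> configuredVectors) {
        Map<String, VectorInfo> result = new TreeMap<>();
        for (String label : existingLabels) {
            Map<String, Double> vector = configuredVectors.get(label);
            if (vector == null) {
                // Unconfigured but accessible label: nothing to report.
                continue;
            }
            result.put(label, new VectorInfo(vector));
        }
        return result;
    }
}
```

      Skipping the label entirely, rather than substituting an empty vector, matches the observation above that unconfigured labels do not need a capacityVector in the response.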

            People

              bteke Benjamin Teke