Hadoop YARN / YARN-8951

Defining default queue placement rule in allocations file with create="false" throws an NPE


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      If the default queue placement rule is defined with create="false" and a scheduling request is created for queue "root.default", FairScheduler#assignToQueue throws an NPE. The NPE is thrown in the catch block for IllegalStateException while the error message is being constructed: that code assumes rmApp is not null, but in this case it is null.

      Example of such a config file:

      <?xml version="1.0"?>
      <allocations>
      	<queue name="parentq" type="parent">
      		<minResources>1024mb,0vcores</minResources>
      	</queue>
      	<queuePlacementPolicy>
      		<rule name="default" create="false"/>
      	</queuePlacementPolicy>
      </allocations>
      

      This is suspicious, as the same method already contains some null checks for rmApp.
      I am not sure whether this is a special case that only occurs in the tests or whether it is reproducible in a cluster; this needs further investigation.

      In any case, we should not dereference rmApp when it is null.
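
      A minimal sketch of the kind of null-safe message construction meant here. This is not the actual FairScheduler code: AppInfo, PlacementErrorMessages, rejectionMessage and the message wording are hypothetical names made up for illustration, and AppInfo only stands in for RMApp.

```java
// Hypothetical stand-in for RMApp; only the accessor this sketch needs.
interface AppInfo {
    String getUser();
}

final class PlacementErrorMessages {
    private PlacementErrorMessages() {
    }

    // Builds a rejection message without dereferencing a possibly-null app.
    static String rejectionMessage(AppInfo rmApp, String appId,
                                   String queueName, Exception cause) {
        // Fall back to a placeholder when no app object is available,
        // instead of calling a method on null.
        String user = (rmApp != null) ? rmApp.getUser() : "<unknown user>";
        return "Application " + appId + " submitted by " + user
            + " cannot be placed in queue " + queueName
            + ": " + cause.getMessage();
    }
}
```

      With a guard like this, the catch block would still produce a usable message when rmApp is null, rather than masking the original IllegalStateException with an NPE.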

      On the other hand, I'm not sure if the default queue placement rule with create="false" makes sense at all. Looking at the documentation (https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/FairScheduler.html):

      default: the app is placed into the queue specified in the ‘queue’ attribute of the default rule. If ‘queue’ attribute is not specified, the app is placed into ‘root.default’ queue.

      A queuePlacementPolicy element: which contains a list of rule elements that tell the scheduler how to place incoming apps into queues. Rules are applied in the order that they are listed. Rules may take arguments. All rules accept the “create” argument, which indicates whether the rule can create a new queue. “Create” defaults to true; if set to false and the rule would place the app in a queue that is not configured in the allocations file, we continue on to the next rule. The last rule must be one that can never issue a continue....

      In this case, the rule omits the queue attribute, so apps should be placed into the root.default queue, which is not defined in the config file. Because create is false, root.default cannot be created either, so the rule has no queue it can place the app into.

      This looks like an invalid queue configuration file to me.
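
      The reasoning above can be sketched as a small check. This is only a model of the documented behaviour, not FairScheduler code: DefaultRuleCheck, targetQueue and canPlace are hypothetical names, and the fully qualified queue name "root.parentq" in the usage note is an assumption about how the example config would be resolved.

```java
import java.util.Set;

final class DefaultRuleCheck {
    // Per the documentation, the default rule falls back to this queue
    // when no 'queue' attribute is given.
    static final String FALLBACK_QUEUE = "root.default";

    private DefaultRuleCheck() {
    }

    // Queue the default rule would target for a given 'queue' attribute.
    static String targetQueue(String queueAttr) {
        return (queueAttr == null || queueAttr.isEmpty())
            ? FALLBACK_QUEUE : queueAttr;
    }

    // The rule can place an app if it may create the queue,
    // or if the target queue is already configured.
    static boolean canPlace(String queueAttr, boolean create,
                            Set<String> configuredQueues) {
        return create || configuredQueues.contains(targetQueue(queueAttr));
    }
}
```

      For the example file, canPlace(null, false, queues) with queues containing only "root.parentq" returns false, so a default rule that is also the last rule would have nowhere to put the app.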

      jlowe, leftnoteasy: What is your take on this?

       


            People

              Assignee: Unassigned
              Reporter: Szilard Nemeth (snemeth)
              Votes: 0
              Watchers: 3
