Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-952

Capacity scheduler reconfiguration of queues does not work for add sub-queues to an existing queue

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Later
    • None
    • None
    • capacityscheduler
    • None

    Description

      If we have an existing queue configuration such as

      root

      ---> A
      ---> B

      and we attempt to reconfigure it so that we now have

      root

      ---> A
      ---> A1
      ---> A2
      ---> B

      we get an IOException as follows:

      java.io.IOException: Failed to re-init queues
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:197)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue.testInitializeQueue(TestLeafQueue.java:206)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
      at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
      at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
      at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
      at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
      at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
      at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
      at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
      at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
      at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
      at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
      at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
      at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
      at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:45)
      at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
      at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
      at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
      at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:172)
      at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:78)
      at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:70)
      Caused by: java.io.IOException: Trying to reinitialize root.a from root.a
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.reinitialize(LeafQueue.java:524)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.reinitialize(ParentQueue.java:360)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:240)
      at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:194)
      ... 32 more

      This is apparently because the CapacityScheduler still wants to think of A as a LeafQueue instead of realizing it to be updated as a ParentQueue.

      Maybe, this use case is not supposed to be supported, in which case, probably the documentation should be updated to state this scenario as such more clearly (currently, it atleast implies that only deletion is not supported).

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              anupamseth Anupam Seth
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: