Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.7.2
- None
- Reviewed
Description
We upgraded our clusters from 2.4.1 to 2.7.2 and saw the following exception in the RM logs:
Caused by: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Both <reservation> and type="parent" found for queue root.adhoc which is unsupported
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.loadQueue(AllocationFileLoaderService.java:519)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:352)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1440)
From the exception, it looks as if we had configured <reservation> for the queue, but we have not. The issue is that AllocationFileLoaderService#loadQueue assumes that a queue marked as type="parent" cannot have configured child queues. That is a problem when a queue is marked as 'parent' because it has no configured child queues to start with, and child queues are added to the configuration later on.
The exception message is also misleading, since it blames <reservation>, which does not appear anywhere in our configuration.
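The behaviour described above can be sketched in isolation. This is a simplified illustration, not the Hadoop source: the class LoadQueueCheckSketch, its method names, the boolean parameters, and the stand-in exception type are all hypothetical. Only the message text is taken from the log above. reportedCheck models what the description says loadQueue does, and messageBasedCheck is one plausible reading of what the message implies, which is an assumption and not necessarily the committed fix.

```java
public class LoadQueueCheckSketch {

    /** Unchecked stand-in for AllocationConfigurationException (hypothetical, illustration only). */
    static class ConfigException extends RuntimeException {
        ConfigException(String message) {
            super(message);
        }
    }

    private static String message(String queueName) {
        return "Both <reservation> and type=\"parent\" found for queue "
                + queueName + " which is unsupported";
    }

    /**
     * Models the behaviour reported in this issue: any queue that has
     * configured child queues and type="parent" is rejected, whether or
     * not a reservation element is present.
     */
    static void reportedCheck(String queueName, boolean hasChildQueues,
                              boolean typeParent, boolean hasReservation) {
        if (hasChildQueues && typeParent) {
            throw new ConfigException(message(queueName));
        }
    }

    /**
     * Assumed intent, read from the message text: reject only when a
     * reservation element and type="parent" are both present.
     */
    static void messageBasedCheck(String queueName, boolean hasChildQueues,
                                  boolean typeParent, boolean hasReservation) {
        if (hasReservation && typeParent) {
            throw new ConfigException(message(queueName));
        }
    }

    public static void main(String[] args) {
        // Queue from the log: type="parent", child queues, no reservation.
        try {
            reportedCheck("root.adhoc", true, true, false);
            System.out.println("reportedCheck: accepted");
        } catch (ConfigException e) {
            System.out.println("reportedCheck: " + e.getMessage());
        }
        messageBasedCheck("root.adhoc", true, true, false);
        System.out.println("messageBasedCheck: accepted");
    }
}
```

With the queue from the log (child queues present, type="parent", no reservation), reportedCheck throws while messageBasedCheck accepts the queue, which is the mismatch this issue describes.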
How to reproduce:
Run the Fair Scheduler with the following queue config:
<queue name="p" type="parent">
  <weight>10</weight>
  <maxRunningApps>300</maxRunningApps>
  <queue name="c">
    <weight>3</weight>
  </queue>
</queue>
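To check whether an existing queue definition matches the combination in this report, a small standalone inspector can help. The sketch below is not part of Hadoop: the class ReproConfigInspector and its methods are hypothetical names, and it uses only the JDK's DOM parser. It assumes the XML passed in has a single queue element as its root, as in the snippet above, rather than a full allocation file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReproConfigInspector {

    /** The queue config from this report. */
    static final String REPRO_XML =
            "<queue name=\"p\" type=\"parent\">"
          + "  <weight>10</weight>"
          + "  <maxRunningApps>300</maxRunningApps>"
          + "  <queue name=\"c\">"
          + "    <weight>3</weight>"
          + "  </queue>"
          + "</queue>";

    /** Returns true if the element has a direct child element with the given tag name. */
    static boolean hasChildElement(Element parent, String tagName) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Element
                    && tagName.equals(((Element) node).getTagName())) {
                return true;
            }
        }
        return false;
    }

    /** Parses the XML and returns its root element; parse failures become unchecked. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse queue XML", e);
        }
    }

    /**
     * True when the root queue is marked type="parent", has at least one
     * configured child queue, and has no reservation element: the
     * combination described in this report.
     */
    static boolean matchesReportedCase(String xml) {
        Element queue = parseRoot(xml);
        return "parent".equals(queue.getAttribute("type"))
                && hasChildElement(queue, "queue")
                && !hasChildElement(queue, "reservation");
    }

    public static void main(String[] args) {
        System.out.println(matchesReportedCase(REPRO_XML)); // prints "true"
    }
}
```

A queue with type="parent" but no child queue elements does not match, which corresponds to the configuration that loaded without error before child queues were added.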
Attachments
Issue Links
- is broken by: YARN-2738 Add FairReservationSystem for FairScheduler (Closed)