Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
3.5.0
Description
YARN-10178 tried to fix the issue but there are still 2 property that might change during sorting which causes an exception.
2024-04-10 12:36:56,420 ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[Thread-28,5,main] threw an Exception. java.lang.IllegalArgumentException: Comparison method violates its general contract! at java.util.TimSort.mergeHi(TimSort.java:899) at java.util.TimSort.mergeAt(TimSort.java:516) at java.util.TimSort.mergeCollapse(TimSort.java:441) at java.util.TimSort.sort(TimSort.java:245) at java.util.Arrays.sort(Arrays.java:1512) at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:348) at java.util.stream.Sink$ChainedReference.end(Sink.java:258) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:483) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.policy.PriorityUtilizationQueueOrderingPolicy.getAssignmentIterator(PriorityUtilizationQueueOrderingPolicy.java:260) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.sortAndGetChildrenAllocationIterator(ParentQueue.java:1100) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:1111) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:942) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateOrReserveNewContainers(CapacityScheduler.java:1719) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainerOnSingleNode(CapacityScheduler.java:1654) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1811) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1557) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.schedule(CapacityScheduler.java:539) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler$AsyncScheduleThread.run(CapacityScheduler.java:591)
The `queue.getAccessibleNodeLabels()` and `queue.getPriority()` could change in another thread while the `queues` are being sorted. Those should be saved when constructing the PriorityQueueResourcesForSorting helper object.
Attachments
Issue Links
- relates to
-
YARN-10178 Global Scheduler async thread crash caused by 'Comparison method violates its general contract
- Resolved
- links to