Hadoop YARN / YARN-4743

FairSharePolicy breaks TimSort assumption

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha2
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: fairscheduler
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      2016-02-26 14:08:50,821 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
      java.lang.IllegalArgumentException: Comparison method violates its general contract!
               at java.util.TimSort.mergeHi(TimSort.java:868)
               at java.util.TimSort.mergeAt(TimSort.java:485)
               at java.util.TimSort.mergeCollapse(TimSort.java:410)
               at java.util.TimSort.sort(TimSort.java:214)
               at java.util.TimSort.sort(TimSort.java:173)
               at java.util.Arrays.sort(Arrays.java:659)
               at java.util.Collections.sort(Collections.java:217)
               at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:316)
               at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:240)
               at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1091)
               at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:989)
               at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1185)
               at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112)
               at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
               at java.lang.Thread.run(Thread.java:745)
      2016-02-26 14:08:50,822 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
      

      This bug was actually found in 2.6.0-cdh5.4.7. FairShareComparator is not transitive.

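      The usage-to-weight ratio below evaluates to NaN when memorySize=0 and weight=0, because a floating-point 0/0 yields NaN instead of throwing. A comparator built on that ratio can then return inconsistent results.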

      FairSharePolicy.java
      useToWeightRatio1 = s1.getResourceUsage().getMemorySize() /
        s1.getWeights().getWeight(ResourceType.MEMORY);
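
      A minimal sketch of why a NaN ratio can break the sort contract. This is not the FairSharePolicy code: the class NaNRatioDemo, the helper ratio(), and the NAIVE comparator are hypothetical, and the sketch assumes a comparator that orders ratios with < and >. With such a comparator, NaN compares as "equal" to every other value, so two values that are each "equal" to NaN can still be unequal to each other, which is what TimSort detects.

```java
import java.util.Comparator;

public class NaNRatioDemo {
    // Hypothetical stand-in for the usage-to-weight ratio in the snippet above.
    // long / float is floating-point division, so 0 / 0.0f is NaN (no exception).
    static double ratio(long memorySize, float weight) {
        return memorySize / weight;
    }

    // Assumed naive comparator: both a < b and a > b are false when either
    // side is NaN, so NaN ends up "equal" to everything.
    static final Comparator<Double> NAIVE =
        (a, b) -> a < b ? -1 : (a > b ? 1 : 0);

    public static void main(String[] args) {
        double nan = ratio(0L, 0.0f);  // memorySize=0, weight=0
        double low = ratio(1L, 1.0f);  // 1.0
        double high = ratio(2L, 1.0f); // 2.0

        System.out.println(Double.isNaN(nan));         // true
        System.out.println(NAIVE.compare(low, nan));   // 0: low "equals" NaN
        System.out.println(NAIVE.compare(nan, high));  // 0: NaN "equals" high
        System.out.println(NAIVE.compare(low, high));  // -1: yet low < high

        // Double.compare defines a total order that places NaN above all
        // other values, so it stays consistent.
        System.out.println(Double.compare(low, nan));  // -1
    }
}
```

      One way to keep such a comparator consistent, under the same assumptions, is to guard against a zero weight before dividing or to compare with Double.compare. Whether the attached patches take either approach is not stated in this description.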
      

        Attachments

        1. YARN-4743-v5.patch
          12 kB
          Yufei Gu
        2. YARN-4743-v4.patch
          12 kB
          Zephyr Guo
        3. YARN-4743-v3.patch
          12 kB
          Zephyr Guo
        4. YARN-4743-v2.patch
          10 kB
          Zephyr Guo
        5. timsort.log
          94 kB
          Zephyr Guo
        6. YARN-4743-v1.patch
          9 kB
          Zephyr Guo

              People

              • Assignee:
                gzh1992n Zephyr Guo
                Reporter:
                gzh1992n Zephyr Guo
              • Votes:
                1
                Watchers:
                23