Hadoop YARN / YARN-8833

Avoid potential integer overflow when computing fair shares


    Details

    • Hadoop Flags:
      Reviewed

      Description

      When w2rRatio is used to compute fair shares, an int overflow can occur and cause the computation to enter an infinite loop.

      Because the thread computing the shares holds the writeLock, this can block the scheduling thread.

      This issue occurred in a production environment, and we have already fixed it.

       

      Added 2018-10-29: elaboration of the problem

      /**
       * Compute the resources that would be used given a weight-to-resource ratio
       * w2rRatio, for use in the computeFairShares algorithm as described in #
       */
      private static int resourceUsedWithWeightToResourceRatio(double w2rRatio,
          Collection<? extends Schedulable> schedulables, String type) {
        int resourcesTaken = 0;
        for (Schedulable sched : schedulables) {
          int share = computeShare(sched, w2rRatio, type);
          resourcesTaken += share;
        }
        return resourcesTaken;
      }

      The variable resourcesTaken is an int. It accumulates the results of computeShare(Schedulable sched, double w2rRatio, String type), each of which is a value between the min share and max share of a queue.

      For example, when several queues each have min share = max share = Integer.MAX_VALUE, resourcesTaken exceeds the int range and wraps around, and it can become a negative number (two such shares already sum to -2).
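The wrap-around can be reproduced outside the scheduler. The following is a minimal stand-alone sketch, not code from Hadoop: the class `ShareSumOverflowDemo` and its methods are hypothetical, and the per-queue shares are passed in directly rather than produced by computeShare. It shows that two shares of Integer.MAX_VALUE are already enough to turn the int total negative, while the same sum held in a long stays correct.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone demo; not part of the Hadoop code base.
class ShareSumOverflowDemo {

  // Mirrors the int accumulation in resourceUsedWithWeightToResourceRatio,
  // with the shares supplied directly instead of computed per queue.
  static int sumAsInt(List<Integer> shares) {
    int resourcesTaken = 0;
    for (int share : shares) {
      resourcesTaken += share; // wraps silently when the int range is exceeded
    }
    return resourcesTaken;
  }

  // Same sum accumulated in a long, which has room for the true total.
  static long sumAsLong(List<Integer> shares) {
    long resourcesTaken = 0L;
    for (int share : shares) {
      resourcesTaken += share;
    }
    return resourcesTaken;
  }

  public static void main(String[] args) {
    List<Integer> shares = Arrays.asList(Integer.MAX_VALUE, Integer.MAX_VALUE);
    System.out.println(sumAsInt(shares));  // prints -2
    System.out.println(sumAsLong(shares)); // prints 4294967294
  }
}
```

Java int arithmetic wraps modulo 2^32 without raising an error, which is why the problem only shows up through its side effects.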

      When resourceUsedWithWeightToResourceRatio(double w2rRatio, Collection<? extends Schedulable> schedulables, String type) returns a negative number, the loop in computeSharesInternal(), which runs while the scheduler lock is held, may never exit.

       

      // org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares
      while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type)
          < totalResource) {
        rMax *= 2.0;
      }

      This may block the scheduling thread.
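The sketch below illustrates why the loop cannot make progress and one possible way to guard the sum. It is an assumption-laden illustration, not the content of the attached patches: `SaturatingShareSum`, `saturatingSum`, and `doublingsUntilExit` are hypothetical names, shares are assumed to be non-negative, and the resources used are modeled as a fixed value because doubling rMax cannot raise a share above a queue's max share. The loop is capped at `maxIterations` only so the demo terminates.

```java
// Hypothetical stand-alone sketch; not taken from the YARN-8833 patches.
class SaturatingShareSum {

  // Accumulate in a long and clamp to Integer.MAX_VALUE, so the result
  // never wraps to a negative number (assumes non-negative shares).
  static int saturatingSum(int[] shares) {
    long total = 0L;
    for (int share : shares) {
      total += share;
    }
    return (int) Math.min(total, (long) Integer.MAX_VALUE);
  }

  // Models the doubling loop from computeSharesInternal with a fixed
  // "resources used" value, and returns how many times rMax was doubled
  // before the loop exited or the iteration cap was reached.
  static int doublingsUntilExit(int resourcesUsed, int totalResource, int maxIterations) {
    double rMax = 1.0;
    int iterations = 0;
    while (resourcesUsed < totalResource && iterations < maxIterations) {
      rMax *= 2.0;
      iterations++;
    }
    return iterations;
  }

  public static void main(String[] args) {
    int[] shares = {Integer.MAX_VALUE, Integer.MAX_VALUE};
    int wrapped = shares[0] + shares[1];    // -2 after int wrap-around
    int clamped = saturatingSum(shares);    // Integer.MAX_VALUE

    // With the wrapped value the condition stays true until the cap is hit.
    System.out.println(doublingsUntilExit(wrapped, 1000, 64)); // prints 64
    // With the clamped value the condition is false immediately.
    System.out.println(doublingsUntilExit(clamped, 1000, 64)); // prints 0
  }
}
```

Clamping is one design choice among several; `Math.addExact` would instead throw an ArithmeticException on overflow, which makes the failure visible but requires the caller to handle it.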

        Attachments

        1. YARN-8833.1.patch
          5 kB
          liyakun
        2. YARN-8833.2.patch
          5 kB
          liyakun
        3. YARN-8833.3.patch
          5 kB
          liyakun
        4. YARN-8833.patch
          6 kB
          liyakun
        5. YARN-8833-branch-2.003.patch
          15 kB
          liyakun
        6. YARN-8833-branch-2.1.patch
          5 kB
          liyakun
        7. YARN-8833-branch-2.2.patch
          15 kB
          liyakun


              People

              • Assignee:
                yoelee liyakun
                Reporter:
                yoelee liyakun
              • Votes:
                0
                Watchers:
                6
