  1. Hadoop YARN
  2. YARN-5686

DefaultContainerExecutor random working dir algorithm skews results

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      long randomPosition = RandomUtils.nextLong() % totalAvailable;
          ...
          while (randomPosition > availableOnDisk[dir]) {
            randomPosition -= availableOnDisk[dir++];
          }
      

      The code above selects a disk using a random number weighted by the free space on each disk. For example, with two disks that have 100 bytes free each, totalAvailable is 200 and randomPosition falls in 0..199. To be fair, 0..99 should select the first disk and 100..199 should select the second. That is not the case right now: because the loop condition uses >, the value 100 is not greater than availableOnDisk[0], so it still selects the first disk. The first disk ends up with 101 of the 200 possible values and the second with only 99.
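      The skew in the two-disk example can be checked by enumerating every possible randomPosition instead of sampling. The sketch below is only an illustration: the class SkewDemo and its methods are hypothetical names, and the loop is copied from the snippet above rather than from the full DefaultContainerExecutor source.

```java
import java.util.Arrays;

public class SkewDemo {
    // Same loop as in the description, with the original '>' comparison.
    static int selectDirOriginal(long randomPosition, long[] availableOnDisk) {
        int dir = 0;
        while (randomPosition > availableOnDisk[dir]) {
            randomPosition -= availableOnDisk[dir++];
        }
        return dir;
    }

    // Counts how many positions in [0, totalAvailable) map to each disk.
    static long[] countSelections(long[] availableOnDisk) {
        long totalAvailable = 0;
        for (long available : availableOnDisk) {
            totalAvailable += available;
        }
        long[] counts = new long[availableOnDisk.length];
        for (long pos = 0; pos < totalAvailable; pos++) {
            counts[selectDirOriginal(pos, availableOnDisk)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two disks with 100 bytes free each, as in the example.
        long[] counts = countSelections(new long[] {100, 100});
        System.out.println(Arrays.toString(counts)); // prints [101, 99]
    }
}
```

      A fair split would be 100 and 100; the extra position on the first disk is exactly the boundary value 100.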

      We need to use

      while (randomPosition >= availableOnDisk[dir])
      

      instead of

      while (randomPosition > availableOnDisk[dir])
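
      A minimal sketch of the selection with the proposed >= comparison follows. It is not the attached patch: WeightedDirPicker, selectDir and pickDir are hypothetical names, and java.util.Random with Math.floorMod stands in for the RandomUtils call as an assumption so that the drawn position is never negative.

```java
import java.util.Random;

public class WeightedDirPicker {
    // Maps a position in [0, totalAvailable) to a disk index.
    // With '>=', a position equal to a disk's free space moves on to the next disk.
    static int selectDir(long randomPosition, long[] availableOnDisk) {
        int dir = 0;
        while (randomPosition >= availableOnDisk[dir]) {
            randomPosition -= availableOnDisk[dir++];
        }
        return dir;
    }

    // Draws a position and maps it to a disk, weighted by free space.
    static int pickDir(long[] availableOnDisk, Random rng) {
        long totalAvailable = 0;
        for (long available : availableOnDisk) {
            totalAvailable += available;
        }
        if (totalAvailable <= 0) {
            throw new IllegalArgumentException("no free space on any disk");
        }
        // floorMod keeps the result in [0, totalAvailable) even for negative longs.
        long randomPosition = Math.floorMod(rng.nextLong(), totalAvailable);
        return selectDir(randomPosition, availableOnDisk);
    }
}
```

      With two disks of 100 bytes each, positions 0..99 now map to the first disk and 100..199 to the second. A disk with no free space is also skipped, because any position is >= 0.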
      

      Attachments

        1. YARN-5686.001.patch
          5 kB
          Vrushali C
        2. YARN-5686.002.patch
          5 kB
          Vrushali C

          People

            Assignee: vrushalic Vrushali C
            Reporter: miklos.szegedi@cloudera.com Miklos Szegedi