  1. Hadoop YARN
  2. YARN-5686

DefaultContainerExecutor random working dir algorithm skews results

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      long randomPosition = RandomUtils.nextLong() % totalAvailable;
          ...
          while (randomPosition > availableOnDisk[dir]) {
            randomPosition -= availableOnDisk[dir++];
          }
      

      The code above selects a disk using a random number weighted by the free space on each disk. For example, with two disks that have 100 bytes available each, totalAvailable is 200 and randomPosition falls in the range 0..199. To be fair, 0..99 should select the first disk and 100..199 should select the second. That is not the case right now: for a random number of 100, the condition 100 > 100 is false, so the loop never advances and the first disk is selected. The first disk is therefore chosen for 101 of the 200 values and the second for only 99.

      We need to use

      while (randomPosition >= availableOnDisk[dir])
      

      instead of

      while (randomPosition > availableOnDisk[dir])
      
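      The skew can be checked by enumerating every possible value of randomPosition for the two-disk example. The following is a minimal sketch, not the actual DefaultContainerExecutor code: the class name DiskSelectionSketch and its methods are hypothetical names introduced for illustration, and only the two loop conditions are taken from the description above.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Hadoop YARN.
public class DiskSelectionSketch {

  // Loop condition as quoted in the description (uses >).
  static int selectWithGreaterThan(long randomPosition, long[] availableOnDisk) {
    int dir = 0;
    while (randomPosition > availableOnDisk[dir]) {
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }

  // Proposed loop condition (uses >=).
  static int selectWithGreaterOrEqual(long randomPosition, long[] availableOnDisk) {
    int dir = 0;
    while (randomPosition >= availableOnDisk[dir]) {
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }

  // Counts how many positions in 0..totalAvailable-1 map to each disk.
  static long[] countSelections(long[] availableOnDisk, boolean useProposedCondition) {
    long totalAvailable = 0;
    for (long available : availableOnDisk) {
      totalAvailable += available;
    }
    long[] counts = new long[availableOnDisk.length];
    for (long position = 0; position < totalAvailable; position++) {
      int dir = useProposedCondition
          ? selectWithGreaterOrEqual(position, availableOnDisk)
          : selectWithGreaterThan(position, availableOnDisk);
      counts[dir]++;
    }
    return counts;
  }

  public static void main(String[] args) {
    long[] disks = {100L, 100L}; // two disks with 100 bytes available each
    System.out.println(Arrays.toString(countSelections(disks, false))); // [101, 99]
    System.out.println(Arrays.toString(countSelections(disks, true)));  // [100, 100]
  }
}
```

      With the > comparison the first disk receives one extra position per boundary; with >= each disk receives exactly as many positions as it has free bytes.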

        Attachments

        1. YARN-5686.001.patch
          5 kB
          Vrushali C
        2. YARN-5686.002.patch
          5 kB
          Vrushali C

            People

            • Assignee:
              vrushalic Vrushali C
              Reporter:
              miklos.szegedi@cloudera.com Miklos Szegedi
            • Votes:
              0
              Watchers:
              4