  1. Hadoop YARN
  2. YARN-5686

DefaultContainerExecutor random working dir algorithm skews results


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      long randomPosition = RandomUtils.nextLong() % totalAvailable;
      ...
      while (randomPosition > availableOnDisk[dir]) {
        randomPosition -= availableOnDisk[dir++];
      }
      

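      The code above selects a disk at random, weighted by the free space on each disk. For example, with two disks that have 100 bytes free each, totalAvailable is 200 and randomPosition falls in 0..199. Values 0..99 should select the first disk and values 100..199 should select the second. To be fair, random number 100 should therefore select the second disk, but this is not the case right now: because the loop condition uses >, a randomPosition of 100 is not greater than availableOnDisk[0], so the loop does not advance and the first disk is selected. The first disk is chosen for 101 of the 200 values and the second for only 99.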

      We need to use

      while (randomPosition >= availableOnDisk[dir])
      

      instead of

      while (randomPosition > availableOnDisk[dir])
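
      The off-by-one can be checked with a small stand-alone sketch. This is not the DefaultContainerExecutor source: the class name WeightedDiskPick and its methods are hypothetical, and only the two loop conditions are taken from the description above. It walks every possible randomPosition for two disks with 100 bytes free each and counts how often each disk is chosen.

```java
import java.util.Arrays;

// Hypothetical illustration only; names are not from the Hadoop code base.
class WeightedDiskPick {

  // Loop as quoted in the description: uses '>'.
  static int pickWithGreaterThan(long randomPosition, long[] availableOnDisk) {
    int dir = 0;
    while (randomPosition > availableOnDisk[dir]) {
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }

  // Loop with the proposed condition: uses '>='.
  static int pickWithGreaterOrEqual(long randomPosition, long[] availableOnDisk) {
    int dir = 0;
    while (randomPosition >= availableOnDisk[dir]) {
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }

  // Counts how many positions in 0..totalAvailable-1 land on each disk.
  static long[] countHits(long[] availableOnDisk, boolean useProposed) {
    long totalAvailable = 0;
    for (long free : availableOnDisk) {
      totalAvailable += free;
    }
    long[] hits = new long[availableOnDisk.length];
    for (long position = 0; position < totalAvailable; position++) {
      int dir = useProposed
          ? pickWithGreaterOrEqual(position, availableOnDisk)
          : pickWithGreaterThan(position, availableOnDisk);
      hits[dir]++;
    }
    return hits;
  }

  public static void main(String[] args) {
    long[] free = {100, 100};
    System.out.println(Arrays.toString(countHits(free, false))); // [101, 99]
    System.out.println(Arrays.toString(countHits(free, true)));  // [100, 100]
  }
}
```

      With '>' the boundary value 100 stays on the first disk, giving a 101/99 split; with '>=' each disk receives exactly 100 of the 200 positions.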
      


          People

            Assignee: Vrushali C (vrushalic)
            Reporter: Miklos Szegedi (miklos.szegedi@cloudera.com)