Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 2.8.5, 3.0.4
    • Component/s: None
    • Labels: None

      Description

      Shuffle LOCAL_DIRS, LOG_DIRS and LOCAL_USER_DIRS when launching a container. Some applications process these lists in exactly the same way in every container (e.g. round-robin), which can cause some disks to be unnecessarily overloaded (e.g. when one output file is always written to the first entry specified in the environment variable).
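
      As a rough illustration of the shuffling idea (not the code from the attached patches), the sketch below randomizes the order of a directory list before it is exported to a container. The class name DirListShuffler, the method shuffleDirs, and the comma-separated format of the value are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper; not part of Hadoop or YARN. */
final class DirListShuffler {

    private DirListShuffler() {
    }

    /**
     * Returns the same directories in a random order.
     * Assumption: the value is a comma-separated list of paths.
     */
    static String shuffleDirs(String commaSeparatedDirs, Random random) {
        // Copy into a mutable list, because Arrays.asList is fixed-size.
        List<String> dirs =
            new ArrayList<>(Arrays.asList(commaSeparatedDirs.split(",")));
        // Each container launch gets its own permutation, so "always use
        // the first entry" no longer lands on the same disk every time.
        Collections.shuffle(dirs, random);
        return String.join(",", dirs);
    }
}
```

      A caller would apply this once per container launch to each of the three lists, for example shuffleDirs("/data1/nm,/data2/nm,/data3/nm", new Random()), where the paths are made up for the example.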

      There are two paths for local dir allocation, depending on whether the requested size is unknown or known. The unknown-size path already uses a random algorithm. The known-size path initializes with a random starting point and then goes round-robin after that: when selecting a dir, it increments the last-used index by one and then checks sequentially until it finds a dir that satisfies the request. The proposal is to increment by a random value between 1 and num_dirs - 1 instead, and then check sequentially from there. This should result in a more random selection in all cases.
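
      The proposed known-size selection can be sketched roughly as follows. This is a simplified model under stated assumptions, not the allocator code from the patches: the class RandomizedDirPicker, its freeBytes array (standing in for a real free-space check), and the pick method are hypothetical names introduced for this example.

```java
import java.util.Random;

/** Hypothetical model of the proposed selection; not Hadoop code. */
final class RandomizedDirPicker {
    private final long[] freeBytes; // assumed free space per dir
    private final Random random;
    private int lastUsed;

    RandomizedDirPicker(long[] freeBytes, Random random) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("need at least one dir");
        }
        this.freeBytes = freeBytes.clone();
        this.random = random;
        // Random starting point, as the known-size path already does.
        this.lastUsed = random.nextInt(freeBytes.length);
    }

    /** Returns the index of a dir with at least size bytes free, or -1. */
    int pick(long size) {
        int n = freeBytes.length;
        // Random increment in [1, n - 1]; with one dir there is no choice.
        int step = (n > 1) ? 1 + random.nextInt(n - 1) : 0;
        int start = (lastUsed + step) % n;
        // Check sequentially from the randomized position, wrapping around.
        for (int i = 0; i < n; i++) {
            int candidate = (start + i) % n;
            if (freeBytes[candidate] >= size) {
                lastUsed = candidate;
                return candidate;
            }
        }
        return -1; // no dir satisfies the request
    }
}
```

      Because the increment is never 0 or a multiple of num_dirs, the search never starts at the dir used last time; that dir is only chosen again if every other dir fails the size check.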

        Attachments

        1. HADOOP-15548.001.patch
          3 kB
          Jim Brennan
        2. HADOOP-15548.002.patch
          4 kB
          Jim Brennan
        3. HADOOP-15548-branch-2.001.patch
          4 kB
          Jim Brennan

            People

            • Assignee: Jim Brennan
            • Reporter: Jim Brennan
            • Votes: 0
            • Watchers: 8