[HADOOP-15548] Randomize local dirs - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 2.8.5, 3.0.4
Component/s: None
Labels: None

Description

Shuffle LOCAL_DIRS, LOG_DIRS, and LOCAL_USER_DIRS when launching a container. Some applications process these lists in exactly the same way in every container (e.g. round-robin), which can overload individual disks unnecessarily (e.g. when every container writes its one output file to the first entry in the environment variable).
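A minimal sketch of the shuffle described above, assuming the environment variable holds a comma-separated list of directories. The class and method names are hypothetical, and this is not the code from the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, for illustration only.
final class ShuffledDirs {
    // Returns the same comma-separated entries in a random order
    // (assumption: entries are separated by commas and contain none).
    static String shuffle(String commaSeparatedDirs, Random rng) {
        List<String> dirs =
            new ArrayList<>(Arrays.asList(commaSeparatedDirs.split(",")));
        Collections.shuffle(dirs, rng);
        return String.join(",", dirs);
    }
}
```

With a different order in each container, an application that always writes to the first entry no longer sends every container's output to the same disk.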

There are two paths for local dir allocation, depending on whether the size is known. The unknown-size path already uses a random algorithm. The known-size path initializes with a random starting point and then goes round-robin: when selecting a dir, it increments the last-used index by one and then checks sequentially until it finds a dir that satisfies the request. The proposal is to increment by a random value between 1 and num_dirs - 1 instead, and then check sequentially from there. This should result in a more random selection in all cases.
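The proposed known-size selection can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the class name, the `freeBytes` array standing in for per-dir capacity checks, and the `-1` return value are all hypothetical.

```java
import java.util.Random;

// Hypothetical model of the proposed selection; not the patched Hadoop class.
final class RandomStrideDirPicker {
    private final long[] freeBytes; // assumed free space per dir
    private final Random rng;
    private int lastUsed;

    RandomStrideDirPicker(long[] freeBytes, Random rng) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("need at least one dir");
        }
        this.freeBytes = freeBytes.clone();
        this.rng = rng;
        // Random starting point, as in the existing known-size path.
        this.lastUsed = rng.nextInt(freeBytes.length);
    }

    // Returns the index of the chosen dir, or -1 if no dir can hold `size`.
    int pick(long size) {
        int n = freeBytes.length;
        // Random increment in [1, n - 1]; with a single dir, fall back to 1.
        int stride = n > 1 ? 1 + rng.nextInt(n - 1) : 1;
        int start = (lastUsed + stride) % n;
        // Check sequentially from the random start, wrapping around once.
        for (int i = 0; i < n; i++) {
            int candidate = (start + i) % n;
            if (freeBytes[candidate] >= size) {
                lastUsed = candidate;
                return candidate;
            }
        }
        return -1;
    }
}
```

Because the increment is never 0 or a multiple of num_dirs, the scan never starts at the dir that was used last, so consecutive requests land on different dirs whenever more than one dir has enough space.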

Attachments

HADOOP-15548-branch-2.001.patch (4 kB), 29/Jun/18 19:55, Jim Brennan
HADOOP-15548.002.patch (4 kB), 28/Jun/18 22:14, Jim Brennan
HADOOP-15548.001.patch (3 kB), 21/Jun/18 18:52, Jim Brennan

People

Assignee: Jim Brennan
Reporter: Jim Brennan
Votes: 0
Watchers: 8

Dates

Created: 20/Jun/18 14:16
Updated: 14/Oct/19 15:37
Resolved: 29/Jun/18 20:49