Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
All
Description
Currently, the hadoop.tmp.dir configuration variable accepts only a single directory to be used as scratch space. On job launcher nodes, this means the entire job fails if that directory is unusable for any reason. When the job launcher nodes have multiple volumes, the availability of tmp space can be improved by spreading it across those volumes, choosing one either randomly or in round-robin order. The code for choosing a volume from a comma-separated list of volumes already exists for mapred.local.dir and similar settings. The job client needs to use it as well.
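
To make the proposal concrete, here is a minimal sketch of round-robin selection over a comma-separated directory list that skips unusable entries. It is not Hadoop's existing volume-selection code. The class name ScratchDirPicker and the method pick() are hypothetical, and treating "usable" as "exists, is a directory, and is writable" is an assumption made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not a Hadoop class): round-robin choice among scratch directories. */
final class ScratchDirPicker {
    private final List<File> dirs = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** Parses a comma-separated list such as "/vol1/tmp,/vol2/tmp"; blank entries are ignored. */
    ScratchDirPicker(String commaSeparatedDirs) {
        for (String entry : commaSeparatedDirs.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no directories configured");
        }
    }

    /**
     * Returns the next usable directory in round-robin order.
     * Assumption: "usable" means an existing, writable directory.
     * Fails only after every configured directory has been tried once.
     */
    File pick() throws IOException {
        for (int attempt = 0; attempt < dirs.size(); attempt++) {
            int index = Math.floorMod(next.getAndIncrement(), dirs.size());
            File dir = dirs.get(index);
            if (dir.isDirectory() && dir.canWrite()) {
                return dir;
            }
        }
        throw new IOException("no usable scratch directory among " + dirs);
    }
}
```

With a list of several volumes, a single bad volume is skipped rather than failing the job. An error is raised only when none of the configured directories can be used.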