  1. Hadoop Map/Reduce
  2. MAPREDUCE-311

JobClient should use multiple volumes as hadoop.tmp.dir

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • All

    Description

      Currently, the hadoop.tmp.dir configuration variable accepts only a single directory to be used as scratch space. As a result, if that directory becomes unusable on a job launcher node, the entire job fails, even when the node has other volumes that are still usable. On job launcher nodes with multiple volumes, scratch-space availability can be improved by spreading use across those volumes, choosing one either randomly or in round-robin order. The code for choosing a volume from a comma-separated list of volumes already exists for mapred.local.dir and similar settings. The job client should use it as well.
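
      The round-robin selection proposed above can be sketched roughly as follows. This is a minimal illustration in plain Java, not Hadoop's existing selection code: the class name ScratchDirPicker, its pick() method, and the usability check (the path is an existing, writable directory) are all assumptions made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of Hadoop): picks a scratch directory
 * from a comma-separated list in round-robin order, skipping entries
 * that are currently unusable.
 */
public class ScratchDirPicker {
    private final List<File> dirs = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger(0);

    /** @param commaSeparatedDirs for example "/vol1/tmp,/vol2/tmp" (assumed format) */
    public ScratchDirPicker(String commaSeparatedDirs) {
        for (String entry : commaSeparatedDirs.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no directories configured");
        }
    }

    /** Returns the next usable directory, trying each configured entry at most once. */
    public File pick() {
        for (int attempt = 0; attempt < dirs.size(); attempt++) {
            // floorMod keeps the index non-negative even if the counter overflows.
            int idx = Math.floorMod(next.getAndIncrement(), dirs.size());
            File candidate = dirs.get(idx);
            // Assumed definition of "usable": an existing, writable directory.
            if (candidate.isDirectory() && candidate.canWrite()) {
                return candidate;
            }
        }
        throw new IllegalStateException("no usable scratch directory among " + dirs);
    }
}
```

      With a picker like this, a single bad volume only causes that entry to be skipped. The job fails only when every configured directory is unusable.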

          People

            Assignee: Unassigned
            Reporter: Milind Barve (milindb)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: