  1. Hadoop Common
  2. HADOOP-3598

Map-Reduce framework needlessly creates temporary _${taskid} directories for Maps

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Incompatible change
    • Release Note: Changed the Map-Reduce framework to no longer create the temporary task output directory (${mapred.out.dir}/_temporary/_${taskid}) for staging outputs when staging isn't necessary.

    Description

      The staging directory for task outputs (i.e. ${mapred.out.dir}/_temporary/_${taskid}) should be created only when Maps produce output on HDFS, which usually isn't the case. Creating it unconditionally plays very badly with HDFS quotas: it may add thousands of temporary names to the FS namespace, thereby exhausting the quotas. In any case, it isn't good to create these directories needlessly.
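
      To illustrate the idea only, here is a minimal sketch of lazy staging-directory creation. It is not the attached patch and does not use the Hadoop FileSystem API: it uses java.nio.file against a local path as a stand-in, and the class name LazyTaskStagingDir and its methods are hypothetical. The point it shows is that the ${mapred.out.dir}/_temporary/_${taskid} path can be computed up front while the directory itself is created only when a task first writes a file into it.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class LazyTaskStagingDir {
    private final Path stagingDir;

    LazyTaskStagingDir(Path outputDir, String taskId) {
        // Only compute <outputDir>/_temporary/_<taskId>; nothing is created here.
        this.stagingDir = outputDir.resolve("_temporary").resolve("_" + taskId);
    }

    /** The staging path, whether or not it exists yet. */
    Path path() {
        return stagingDir;
    }

    /** True once the staging directory has actually been created. */
    boolean exists() {
        return Files.isDirectory(stagingDir);
    }

    /** Creates the staging directory on first use, then opens a file in it. */
    OutputStream create(String fileName) throws IOException {
        // createDirectories does nothing if the directory already exists.
        Files.createDirectories(stagingDir);
        return Files.newOutputStream(stagingDir.resolve(fileName));
    }
}
```

      With this shape, a Map that never calls create(...) leaves no _temporary entry behind, so it adds no names to the namespace.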

      Attachments

        1. HADOOP-3598_1_20080620.patch
          12 kB
          Arun Murthy
        2. HADOOP-3598_0_20080619.patch
          12 kB
          Arun Murthy
        3. HADOOP-3598_0_20080619.patch
          8 kB
          Arun Murthy
        4. HADOOP-3598_0_20080619.patch
          7 kB
          Arun Murthy

          People

            Assignee: Arun Murthy (acmurthy)
            Reporter: Arun Murthy (acmurthy)