Pig / PIG-129

need to create temp files in the task's working directory

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently, Pig creates temp data, such as spilled bags, in the directory specified by java.io.tmpdir. The problem is that this directory is usually shared by all tasks and can easily run out of space.

      A better approach would be to create these files in the temp dir inside the task's working directory. These locations usually have much more space, and they can be hosted on different disks, so performance could be better.

      There are 2 parts to this fix:

      (1) In org.apache.pig.data.DataBag, check whether the temp directory exists and create it if it does not, before trying to create the temp file. This is somewhere around line 390 in the code.
      (2) Change mapred.child.java.opts in hadoop-site.xml to include a new value for the tmpdir property that points to ./tmp. For instance:
      <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024M -Djava.io.tmpdir="./tmp"</value>
      <description>arguments passed to child jvms</description>
      </property>
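
      Part (1) could look roughly like the sketch below. It is not the code from the attached patches: the class name TempFileSketch, the method createSpillFile and the "pigbag" prefix are hypothetical, chosen only for illustration. It assumes the caller passes in the directory named by java.io.tmpdir.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not the actual DataBag code from the patch.
final class TempFileSketch {

    /**
     * Creates a temp file in tmpDir, creating the directory first if it
     * does not exist yet (e.g. ./tmp inside the task's working directory).
     */
    static File createSpillFile(File tmpDir) throws IOException {
        // mkdirs() returns false both on failure and when another thread
        // created the directory in the meantime, so re-check isDirectory().
        if (!tmpDir.isDirectory() && !tmpDir.mkdirs() && !tmpDir.isDirectory()) {
            throw new IOException(
                "Could not create temp directory " + tmpDir.getAbsolutePath());
        }
        // Pass the directory explicitly instead of relying on the default.
        File spill = File.createTempFile("pigbag", null, tmpDir);
        spill.deleteOnExit();
        return spill;
    }
}
```

      A caller would invoke it as createSpillFile(new File(System.getProperty("java.io.tmpdir"))), so the setting from part (2) decides where the file lands.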

      1. PIG-129.patch
        3 kB
        Amir Youssefi
      2. TempAllocator0.patch
        13 kB
        Pi Song

        Activity

        Olga Natkovich created issue -
        Pi Song made changes -
        Field Original Value New Value
        Attachment TempAllocator0.patch [ 12377051 ]
        Amir Youssefi made changes -
        Attachment PIG-129.patch [ 12377297 ]
        Olga Natkovich made changes -
        Resolution Fixed [ 1 ]
        Status Open [ 1 ] Resolved [ 5 ]
        Owen O'Malley made changes -
        Workflow jira [ 12424811 ] no-reopen-closed, patch-avail [ 12425447 ]
        Olga Natkovich made changes -
        Fix Version/s 0.1.0 [ 12312848 ]
        Alan Gates made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Amir Youssefi
            Reporter:
            Olga Natkovich
          • Votes:
            0
            Watchers:
            0
