Hadoop Map/Reduce / MAPREDUCE-4278

Cannot run two local jobs in parallel from the same gateway

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 1.2.0, 2.0.3-alpha, 0.23.7
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      I cannot run two local-mode jobs from Pig in parallel from the same gateway, which is a typical use case. If I re-run the tests sequentially, the tests pass. This appears to be a problem in Hadoop.

      Additionally, the Pig harness expects to be able to run Pig-version-undertest against Pig-version-stable from the same gateway.

      To replicate the error:

      I have two clusters running from the same gateway. If I run the Pig regression suite nightly.conf in local mode in parallel, once on each cluster, conflicts in M/R local mode cause the tests to fail.

      ERROR1:

      org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/file.out in any of the configured local directories
          at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
          at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
          at org.apache.hadoop.mapred.MapOutputFile.getOutputFile(MapOutputFile.java:56)
          at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:944)
          at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:924)
          at org.apache.hadoop.mapred.Task.done(Task.java:875)
          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:374)

      ERROR2:

      2012-05-17 20:25:36,762 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
      2012-05-17 20:25:36,778 [Thread-3] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@ffa490e
      2012-05-17 20:25:36,837 [Thread-3] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
      java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
          at java.util.ArrayList.RangeCheck(ArrayList.java:547)
          at java.util.ArrayList.get(ArrayList.java:322)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getLoadFunc(PigInputFormat.java:153)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:106)
          at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:489)
          at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
          at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
      2012-05-17 20:25:41,291 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher

      Attachments

      1. MAPREDUCE-4278-2-branch1.patch (1 kB, Sandy Ryza)
      2. MAPREDUCE-4278-3-branch1.patch (1 kB, Sandy Ryza)
      3. MAPREDUCE-4278-branch1.patch (1 kB, Sandy Ryza)
      4. MAPREDUCE-4278-trunk.patch (3 kB, Sandy Ryza)
      5. MAPREDUCE-4278-trunk.patch (3 kB, Sandy Ryza)


          Activity

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-0.23-Build #498 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/498/)
          MAPREDUCE-4278. cannot run two local jobs in parallel from the same gateway. (Sandy Ryza via tgraves) (Revision 1434878)

          Result = FAILURE
          tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434878
          Files :

          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java
          • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
          Thomas Graves added a comment -

          merged to branch-0.23

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1308 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1308/)
          MAPREDUCE-4278. Cannot run two local jobs in parallel from the same gateway. Contributed by Sandy Ryza. (Revision 1430363)

          Result = FAILURE
          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1430363
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1280 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1280/)
          MAPREDUCE-4278. Cannot run two local jobs in parallel from the same gateway. Contributed by Sandy Ryza. (Revision 1430363)

          Result = FAILURE
          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1430363
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
          Hudson added a comment -

          Integrated in Hadoop-Yarn-trunk #91 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/91/)
          MAPREDUCE-4278. Cannot run two local jobs in parallel from the same gateway. Contributed by Sandy Ryza. (Revision 1430363)

          Result = SUCCESS
          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1430363
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
          Tom White added a comment -

          +1 I just committed this. Thanks, Sandy!

          Hudson added a comment -

          Integrated in Hadoop-trunk-Commit #3191 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3191/)
          MAPREDUCE-4278. Cannot run two local jobs in parallel from the same gateway. Contributed by Sandy Ryza. (Revision 1430363)

          Result = SUCCESS
          tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1430363
          Files :

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobID.java
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12562733/MAPREDUCE-4278-3-branch1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3183//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12562599/MAPREDUCE-4278-trunk.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3181//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3181//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548802/MAPREDUCE-4278-2-branch1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2927//console

          This message is automatically generated.

          Sandy Ryza added a comment -

          findbugs 2 advised that calling Math.abs(rand.nextInt()) was a bad idea so I switched it to rand.nextInt(Integer.MAX_VALUE) and uploaded a new patch. also uploading a patch for trunk.
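
           For context, the findbugs warning exists because Math.abs(int) overflows for Integer.MIN_VALUE and returns a negative number. A minimal self-contained Java sketch (illustration only, not part of any attached patch) showing the pitfall and the bounded-nextInt alternative:

               import java.util.Random;

               public class AbsPitfall {
                   public static void main(String[] args) {
                       // Math.abs cannot represent +2147483648, so it returns the input unchanged.
                       System.out.println(Math.abs(Integer.MIN_VALUE)); // prints -2147483648

                       // nextInt(bound) draws from [0, bound), so it can never yield a negative id.
                       Random rand = new Random();
                       int id = rand.nextInt(Integer.MAX_VALUE);
                       System.out.println(id >= 0); // always true
                   }
               }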

          Sandy Ryza added a comment -

          I was able to reproduce the issue using Pig and verify that the patch fixed it.

          Araceli Henley added a comment -

          I was able to reproduce the problem easily by kicking off the pig end to
          end tests in parallel. For example, if you kick off the nightly.conf in
          parallel from the same gateway, there are always conflicts. The specific
          conflict varies from run to run depending on the timing, but there are
          always conflicts.

          Tom White added a comment -

          You're right - it's not easy to create a unit test where the job IDs collide with the current code. Can you run a manual test without the patch that runs two jobs and produces a collision, and then test that with the patch there is no collision as a sanity check?

          > Also, I realized that with my approach the randids could get mixed if two jobs were submitted concurrently using the same LocalJobRunner. Is this a concern?

          LocalJobRunner doesn't support running multiple jobs concurrently, so I don't think your change makes things worse. We could add some class javadoc to clarify what it supports (i.e. use an instance of LJR per job to run multiple jobs in a single JVM).
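
           As an illustration of the "one LocalJobRunner per job" usage described above, the following sketch submits two jobs from a single JVM with the newer mapreduce Job API, where each Job object connects to its own local runner. The config key mapreduce.framework.name and the /tmp paths are illustrative assumptions, not details taken from this issue.

               import org.apache.hadoop.conf.Configuration;
               import org.apache.hadoop.fs.Path;
               import org.apache.hadoop.mapreduce.Job;
               import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
               import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

               public class TwoLocalJobs {
                   public static void main(String[] args) throws Exception {
                       // Each Job gets its own client-side runner, so two local jobs can be
                       // driven from one JVM.
                       Job first = newLocalJob("/tmp/in", "/tmp/out1");
                       Job second = newLocalJob("/tmp/in", "/tmp/out2");
                       first.submit();
                       second.submit();
                       // Both jobs are now running; wait for each in turn.
                       boolean firstOk = first.waitForCompletion(true);
                       boolean secondOk = second.waitForCompletion(true);
                       System.out.println("both succeeded: " + (firstOk && secondOk));
                   }

                   private static Job newLocalJob(String in, String out) throws Exception {
                       Configuration conf = new Configuration();
                       conf.set("mapreduce.framework.name", "local"); // force local mode (MR2-style key)
                       Job job = Job.getInstance(conf, "local-sample");
                       FileInputFormat.addInputPath(job, new Path(in));   // default identity map/reduce
                       FileOutputFormat.setOutputPath(job, new Path(out));
                       return job;
                   }
               }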

          Sandy Ryza added a comment -

          If the jobs are in the same process they're already prevented from colliding by the (not sure what to call it) 0001 part of the job id. Do you have any advice on how to test it in light of this?

          Also, I realized that with my approach the randids could get mixed if two jobs were submitted concurrently using the same LocalJobRunner. Is this a concern?

          Tom White added a comment -

          This looks like the right approach. Can you write a unit test that shows that this fixes the problem? You could use a waiting job from UtilsForTests that ensures two jobs are running concurrently.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12546910/MAPREDUCE-4278-branch1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2887//console

          This message is automatically generated.

          Sandy Ryza added a comment -

          If I understand correctly, the job configuration file is named after the job id, which the unique identifier would be a part of, so they would not clash.

          Tom White added a comment -

          > This could be avoided by adding a timestamp component to local job ids?

          It looks like getStagingAreaDir() is using a random number to generate a unique staging directory, so you could reuse that unique identifier for the job ID. Also, the local job directory (localRunner) needs to be made unique too, otherwise the job configuration file could clash.

          Andrew Tindle added a comment -

          Hi - I should have updated this earlier when I found this solution.

          You can use the following PIG_OPTS parameters to define the directories that you want the map/reduce jobs to use, prior to calling pig.

          -Dmapred.local.dir
          -Dmapred.output.dir
          -Dmapred.system.dir
          -Dmapred.temp.dir

          For example, I define them in my program to point to different directories based upon the pig script being called, as below:

          PIG_OPTS="-Dmapred.output.dir=$

          {MAPRED}/$pig_script/output"
          PIG_OPTS="$PIG_OPTS -Dmapred.local.dir=${MAPRED}

          /$pig_script/local"
          PIG_OPTS="$PIG_OPTS -Dmapred.system.dir=$

          {MAPRED}/$pig_script/system"
          PIG_OPTS="$PIG_OPTS -Dmapred.temp.dir=${MAPRED}

          /$pig_script/temp"
          export PIG_OPTS

          Sandy Ryza added a comment -

          This occurs because the jobs are trying to use the same file to locally store their map outputs. They're both using the same directory taskTracker/user/jobcache/job_local_0001/attempt_local_0001_m_000000_0/output. This could be avoided by adding a timestamp component to local job ids? So the jobid would be something like job_local_123456789_0001 instead of job_local_0001.
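
           A minimal sketch of that idea (hypothetical class and method names, not the committed patch): give each JVM's local runner a unique identifier component, so that two processes can no longer both produce job_local_0001 and collide in the shared jobcache directory.

               import java.util.Random;
               import org.apache.hadoop.mapreduce.JobID;

               public class LocalJobIdSketch {
                   private static final Random RAND = new Random();
                   // One unique component per JVM; a timestamp would serve the same purpose.
                   private static final int RUNNER_ID = RAND.nextInt(Integer.MAX_VALUE);
                   private static int jobCounter = 0;

                   public static synchronized JobID nextLocalJobId() {
                       // Yields ids like job_local1234567890_0001 instead of job_local_0001.
                       return new JobID("local" + RUNNER_ID, ++jobCounter);
                   }
               }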

          Andrew Tindle added a comment -

          Hi,

          I am also seeing this same problem when running 2 pig scripts concurrently in local mode. Did you ever receive a resolution?

          Thanks
          Andrew


            People

             • Assignee: Sandy Ryza
             • Reporter: Araceli Henley
             • Votes: 2
             • Watchers: 14
