Details
Improvement
Status: Resolved
Major
Resolution: Fixed
0.20.2
None
None
Description
On the same build machine, multiple instances of the same local job may run at the same time, e.g. the same unit test from a snapshot build and from a release build.
Each build project on our build machine defines an environment variable with a unique value.
JobClient.submitJobInternal() contains the following code:
JobID jobId = jobSubmitClient.getNewJobId();
Path submitJobDir = new Path(getSystemDir(), jobId.toString());
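The code above does not handle this scenario: concurrent local jobs can apparently end up with the same job ID (such as job_local_0002 below) and therefore the same submit directory under the shared system directory, which often leads to the following failure: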
Caused by: org.apache.hadoop.util.Shell$ExitCodeException: chmod: cannot access `/tmp/hadoop-build/mapred/system/job_local_0002': No such file or directory
at org.apache.hadoop.util.Shell.runCommand(Shell.java:195)
at org.apache.hadoop.util.Shell.run(Shell.java:134)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:286)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:354)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:337)
at org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:492)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:484)
at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:286)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:308)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:614)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:802)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:771)
at org.apache.hadoop.mapred.HadoopClient.runJob(HadoopClient.java:177)
One solution would be to incorporate the value of that environment variable into either the new job ID (NewJobId) or the system directory (SystemDir), so that concurrent builds do not conflict.
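
The SystemDir variant of this idea could look roughly like the sketch below. It is not the committed fix and does not use Hadoop classes: the variable name BUILD_PROJECT_ID, the class UniqueSubmitDir and the helper submitJobDir are all hypothetical, chosen only to illustrate putting the per-build value between the system directory and the job ID.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class UniqueSubmitDir {
    // Hypothetical name: the issue does not say what the build machine calls this variable.
    static final String BUILD_ENV_VAR = "BUILD_PROJECT_ID";

    /**
     * Builds systemDir/buildId/jobId so that two builds using the same job ID
     * get different submit directories. Falls back to systemDir/jobId when
     * buildId is null or empty (i.e. the variable is not set).
     */
    static Path submitJobDir(String systemDir, String buildId, String jobId) {
        Path base = Paths.get(systemDir);
        if (buildId != null && !buildId.isEmpty()) {
            base = base.resolve(buildId);
        }
        return base.resolve(jobId);
    }

    public static void main(String[] args) {
        // System.getenv returns null when the variable is not defined.
        String buildId = System.getenv(BUILD_ENV_VAR);
        System.out.println(
            submitJobDir("/tmp/hadoop-build/mapred/system", buildId, "job_local_0002"));
    }
}
```

With this layout, a snapshot build and a release build that both produce job_local_0002 would write to separate directories as long as their environment values differ. A real change would also need to validate the value before using it as a path component, which the sketch omits.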