Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
Description
In JobInProgress.garbageCollect, the following code deletes <job-dir> twice:
    // JobClient always creates a new directory with job files
    // so we remove that directory to cleanup
    FileSystem fs = FileSystem.get(conf);
    fs.delete(new Path(profile.getJobFile()).getParent(), true);

    // Delete temp dfs dirs created if any, like in case of
    // speculative exn of reduces.
    Path tempDir = new Path(conf.getSystemDir(), jobId);
    fs.delete(tempDir, true);
Below is the clean-up trace copied from HADOOP-3182:
- FileSystem.delete <job-dir> by JobTracker as user_account
at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1637)
at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
<job-dir> is obtained by profile.getJobFile().getParent()
- FileSystem.delete <job-dir> again by JobTracker as user_account
at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1642)
at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
<job-dir> is obtained by new Path(conf.getSystemDir(), jobId)
Is there any case in which these two paths are distinct?
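The overlap can be sketched with plain java.nio paths (a standalone illustration, not Hadoop's org.apache.hadoop.fs.Path; the directory layout and job id below are hypothetical examples, assuming the job file is placed directly under <system-dir>/<job-id> at submission time):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class JobDirOverlap {
    public static void main(String[] args) {
        // Hypothetical values mirroring the two expressions in garbageCollect.
        Path systemDir = Paths.get("/mapred/system");        // conf.getSystemDir()
        String jobId = "job_200804210403_0001";              // example job id
        Path jobFile = systemDir.resolve(jobId).resolve("job.xml"); // profile.getJobFile()

        // First delete target: new Path(profile.getJobFile()).getParent()
        Path first = jobFile.getParent();
        // Second delete target: new Path(conf.getSystemDir(), jobId)
        Path second = systemDir.resolve(jobId);

        // Under this layout both expressions resolve to the same <job-dir>,
        // so the second fs.delete is a redundant call on an already-deleted path.
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

If the client's mapred.system.dir differs from the JobTracker's (the situation in HADOOP-3135), the two expressions could name different directories, which is presumably why both deletes exist.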
Attachments
Issue Links
- is part of HADOOP-3135: if the 'mapred.system.dir' in the client jobconf is different from the JobTracker's value, job submission fails (Closed)