  1. Hadoop Common
  2. HADOOP-3214

JobInProgress.garbageCollect deletes <job-dir> twice.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate

    Description

      In JobInProgress.garbageCollect, the following code deletes <job-dir> twice:

            // JobClient always creates a new directory with job files
            // so we remove that directory to cleanup
            FileSystem fs = FileSystem.get(conf);
            fs.delete(new Path(profile.getJobFile()).getParent(), true);
              
            // Delete temp dfs dirs created if any, like in case of 
            // speculative exn of reduces.  
            Path tempDir = new Path(conf.getSystemDir(), jobId); 
            fs.delete(tempDir, true); 
      

      Below is the clean-up trace copied from HADOOP-3182:

      • FileSystem.delete <job-dir> by JobTracker as user_account
        at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1637)
        at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
        at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
        at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
        at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
        <job-dir> is obtained by new Path(profile.getJobFile()).getParent()
      • FileSystem.delete <job-dir> again by JobTracker as user_account
        at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1642)
        at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
        at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
        at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
        at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
        <job-dir> is obtained by new Path(conf.getSystemDir(), jobId)

      Is there any case in which these two paths are distinct?
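
One way to reason about the question above is to compute both delete targets and compare them. The following is only a sketch under stated assumptions, not the JobTracker code: JobDirCleanup and dirsToDelete are hypothetical names, java.nio.file.Path stands in for org.apache.hadoop.fs.Path so the example runs on its own, and the sample paths are made up. It collects the two targets into a set, so a directory that both expressions resolve to would be listed (and deleted) only once.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Hadoop. java.nio.file.Path is used as a
// stand-in for org.apache.hadoop.fs.Path so the sketch is self-contained.
class JobDirCleanup {

    // Returns the distinct directories that the two delete calls would target.
    static Set<Path> dirsToDelete(String jobFile, String systemDir, String jobId) {
        Set<Path> dirs = new LinkedHashSet<>();
        // First target: the parent directory of the job file.
        dirs.add(Paths.get(jobFile).getParent().normalize());
        // Second target: <system-dir>/<job-id>.
        dirs.add(Paths.get(systemDir, jobId).normalize());
        return dirs;
    }

    public static void main(String[] args) {
        // Assumed layout: the job file sits directly under <system-dir>/<job-id>,
        // so both targets are the same directory and the set holds one entry.
        System.out.println(
            dirsToDelete("/mapred/system/job_0001/job.xml", "/mapred/system", "job_0001"));

        // Assumed counter-example: the job file lives outside the system
        // directory, so the two targets differ and the set holds two entries.
        System.out.println(
            dirsToDelete("/user/alice/job_0001/job.xml", "/mapred/system", "job_0001"));
    }
}
```

If the job file is always written under <system-dir>/<job-id>, the set never holds more than one entry and the second delete is redundant. Whether that layout always holds is exactly what this issue asks.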

            People

              Assignee: Unassigned
              Reporter: Tsz-wo Sze (szetszwo)