Hadoop Map/Reduce
MAPREDUCE-5158

Cleanup required when mapreduce.job.restart.recover is set to false

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Component/s: jobtracker
    • Labels:
      None

      Description

      When mapred.jobtracker.restart.recover is set to true and mapreduce.job.restart.recover is set to false for an MR job, job cleanup never happens for that job if the JT restarts while the job is running.

      The .staging directory and the job-info file for that job remain on HDFS forever.
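
The combination of settings described above can be written down as a small predicate. This is only an illustrative model: the class and method names are hypothetical, plain maps stand in for Hadoop's configuration objects, and only explicitly set values are considered, so nothing here says anything about Hadoop's defaults.

```java
import java.util.Map;

/** Toy model of the two settings from the description; not Hadoop's Configuration API. */
class RestartRecoverSketch {
    static final String JT_RECOVER = "mapred.jobtracker.restart.recover";
    static final String JOB_RECOVER = "mapreduce.job.restart.recover";

    /**
     * Returns true only for the case in the report: JT-level recovery explicitly
     * enabled, job-level recovery explicitly disabled, and a JT restart while the
     * job was running. Unset keys never match (no assumption about defaults).
     */
    public static boolean hitsReportedCase(Map<String, String> jtConf,
                                           Map<String, String> jobConf,
                                           boolean jtRestartedWhileRunning) {
        boolean jtRecover = "true".equalsIgnoreCase(jtConf.get(JT_RECOVER));
        boolean jobOptsOut = "false".equalsIgnoreCase(jobConf.get(JOB_RECOVER));
        return jtRecover && jobOptsOut && jtRestartedWhileRunning;
    }
}
```

A job for which this predicate holds is the one whose .staging directory and job-info file are left behind on HDFS.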

      1. MAPREDUCE-5158-br1.patch
        4 kB
        Mayank Bansal
      2. MAPREDUCE-5158-br1-1.patch
        2 kB
        Mayank Bansal

        Activity

        Yesha Vora created issue -
        Arun C Murthy added a comment -

        Good catch Yesha, thanks for filing this!

        Mayank Bansal made changes -
        Field      Original Value   New Value
        Assignee   (empty)          Mayank Bansal [ mayank_bansal ]
        Arun C Murthy added a comment -

        Thanks for taking this up Mayank Bansal! Please LMK if you need any help, I'd love to help get this in ASAP. Thanks!

        Mayank Bansal added a comment -

        Adding patch

        Thanks,
        Mayank

        Mayank Bansal made changes -
        Attachment MAPREDUCE-5158-br1.patch [ 12580650 ]
        Arun C Murthy added a comment -

        Mayank, thanks for the patch!

        I like your idea of using JIP.garbageCollect, but the more I look into it, the more I'm worried about JIP.garbageCollect. This is because it does a whole lot of other cleanup (metrics, JT.finalizeJob, JobHistory cleanup, etc.) which isn't implemented defensively enough. For example, JobHistory.logSubmitted is called in JIP.initTasks, but JT.finalizeJob calls JobHistory.markCompleted, which breaks stuff badly.

        So, I propose a simpler solution: let's move the code in garbageCollect which does cleanup of
        localJobFile, job-system-dir, DelegationTokenRenewal.removeDelegationTokenRenewalForJob and fs.close
        into a separate method.

        Then JIP.garbageCollect can use that too.

        Thoughts?
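
The refactoring proposed in this comment can be sketched roughly as follows. This is a self-contained toy, not the code from either patch: the class, the method names cleanupResources and discardWithoutRecovery, and the order of the steps are assumptions made for illustration, and each cleanup step is recorded as a string instead of touching a file system.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a job tracked by the JobTracker; not the real JobInProgress. */
class JobCleanupSketch {
    /** Records the cleanup steps performed, in order, so the behaviour is observable. */
    public final List<String> log = new ArrayList<>();

    /** Hypothetical extracted method: only the resource cleanup named in the comment. */
    public void cleanupResources() {
        log.add("delete localJobFile");
        log.add("delete job-system-dir");
        log.add("remove delegation token renewal");
        log.add("close fs");
    }

    /** Full cleanup for a job that ran normally: bookkeeping plus the shared resource cleanup. */
    public void garbageCollect() {
        log.add("update metrics");
        log.add("finalize job and job history");
        cleanupResources();
    }

    /** Path for a job that is not recovered after a JT restart: skip bookkeeping, free resources. */
    public void discardWithoutRecovery() {
        cleanupResources();
    }
}
```

The point of the split is that the non-recovered job goes through the resource cleanup only, so the history and metrics bookkeeping that assumes an initialized job is never invoked for it.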

        Mayank Bansal added a comment -

        Thanks Arun for your comments. I agree with your approach.

        Will update with the latest patch.

        Thanks,
        Mayank

        Mayank Bansal added a comment -

        Attaching updated patch.

        Thanks,
        Mayank

        Mayank Bansal made changes -
        Attachment MAPREDUCE-5158-br1-1.patch [ 12580765 ]
        Arun C Murthy added a comment -

        I just committed this after running affected tests. Thanks Mayank!

        Arun C Murthy made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 1.2.0 [ 12321661 ]
        Resolution Fixed [ 1 ]
        Matt Foley added a comment -

        Closed upon release of Hadoop 1.2.0.

        Matt Foley made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Mayank Bansal
          • Reporter:
            Yesha Vora
          • Votes:
            0
          • Watchers:
            8
