Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-4643

Make job-history cleanup-period configurable

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.3
    • Fix Version/s: 1.2.0
    • Component/s: jobhistoryserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Job history cleanup should be made configurable. Currently, it is set to 1 month by default. The DEBUG_MODE (to be removed, see MAPREDUCE-4629) sets it to 20 minutes, but it should be configurable.

      1. MAPREDUCE-4643.patch
        8 kB
        Sandy Ryza
      2. MAPREDUCE-4643-branch1.patch
        11 kB
        Sandy Ryza
      3. MAPREDUCE-4643-branch1.patch
        8 kB
        Sandy Ryza
      4. MAPREDUCE-4643-branch1-2.patch
        9 kB
        Sandy Ryza
      5. MAPREDUCE-4643-branch-1-3.patch
        12 kB
        Sandy Ryza
      6. MAPREDUCE-4643-branch-1-4.patch
        9 kB
        Sandy Ryza
      7. MAPREDUCE-4643-branch-1-5.patch
        11 kB
        Sandy Ryza
      8. MAPREDUCE-4643-branch-1-6.patch
        12 kB
        Sandy Ryza
      9. MAPREDUCE-4643-branch-1-addendum.patch
        2 kB
        Sandy Ryza

        Issue Links

          Activity

          Hide
          Matt Foley added a comment -

          Closed upon release of Hadoop 1.2.0.

          Show
          Matt Foley added a comment - Closed upon release of Hadoop 1.2.0.
          Hide
          Alejandro Abdelnur added a comment -

          Thanks for following up Sandy, committed addendum to branch-1.

          Show
          Alejandro Abdelnur added a comment - Thanks for following up Sandy, committed addendum to branch-1.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12569119/MAPREDUCE-4643-branch-1-addendum.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3330//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12569119/MAPREDUCE-4643-branch-1-addendum.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3330//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          The test appears to be failing occasionally due to time zone issues. Uploading an addendum patch that fixes the failure.

          Show
          Sandy Ryza added a comment - The test appears to be failing occasionally due to time zone issues. Uploading an addendum patch that fixes the failure.
          Hide
          Alejandro Abdelnur added a comment -

          Thanks Sandy. Committed to trunk.

          Show
          Alejandro Abdelnur added a comment - Thanks Sandy. Committed to trunk.
          Hide
          Alejandro Abdelnur added a comment -

          +1. did a clean build and run relevant test.

          Show
          Alejandro Abdelnur added a comment - +1. did a clean build and run relevant test.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12568649/MAPREDUCE-4643-branch-1-6.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3320//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12568649/MAPREDUCE-4643-branch-1-6.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3320//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          Latest patch renames run and tests the case where only some of the files within a directory need to be deleted.

          Show
          Sandy Ryza added a comment - Latest patch renames run and tests the case where only some of the files within a directory need to be deleted.
          Hide
          Karthik Kambatla added a comment -

          Cool test, Sandy. The latest patch looks mostly good; minor comments:

          1. We should rename run(long now) to runCleaner(long now) to avoid confusion.
          2. The test doesn't seem to test the case where only some of the files within a directory need to be deleted. To add this case to the test, we might have to calculate the paths etc. instead of hardcoding them.
          Show
          Karthik Kambatla added a comment - Cool test, Sandy. The latest patch looks mostly good; minor comments: We should rename run(long now) to runCleaner(long now) to avoid confusion. The test doesn't seem to test the case where only some of the files within a directory need to be deleted. To add this case to the test, we might have to calculate the paths etc. instead of hardcoding them.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12568512/MAPREDUCE-4643-branch-1-5.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3319//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12568512/MAPREDUCE-4643-branch-1-5.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3319//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          Latest patch adds test that runs in less than a second.

          Show
          Sandy Ryza added a comment - Latest patch adds test that runs in less than a second.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12568349/MAPREDUCE-4643-branch-1-4.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3315//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12568349/MAPREDUCE-4643-branch-1-4.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3315//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          Latest patch adds the default values into mapred-default.xml and takes out the test, which will be covered in MAPREDUCE-4676.

          Show
          Sandy Ryza added a comment - Latest patch adds the default values into mapred-default.xml and takes out the test, which will be covered in MAPREDUCE-4676 .
          Hide
          Alejandro Abdelnur added a comment -

          patch looks good, 2 things though:

          • mapred-default.xml should have the default values as values (instead of empty)
          • The TestJobHistory test takes 333secs, this seems a bit too much.
          Show
          Alejandro Abdelnur added a comment - patch looks good, 2 things though: mapred-default.xml should have the default values as values (instead of empty) The TestJobHistory test takes 333secs, this seems a bit too much.
          Hide
          Karthik Kambatla added a comment -

          +1. Thanks Sandy - the latest patch looks good to me.

          Show
          Karthik Kambatla added a comment - +1. Thanks Sandy - the latest patch looks good to me.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12568109/MAPREDUCE-4643-branch-1-3.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3307//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12568109/MAPREDUCE-4643-branch-1-3.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3307//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          uploaded refresh patch that includes test

          Show
          Sandy Ryza added a comment - uploaded refresh patch that includes test
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548332/MAPREDUCE-4643-branch1-2.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2918//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548332/MAPREDUCE-4643-branch1-2.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2918//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          I uploaded a patch that makes that revision.

          Show
          Sandy Ryza added a comment - I uploaded a patch that makes that revision.
          Hide
          Alejandro Abdelnur added a comment -

          oh well. looks good, one small NIT though

          +    return result.getTimeInMillis() - 
          +        result.getTimeInMillis() % HistoryCleaner.ONE_DAY_IN_MS;
          

          Assign getTimeInMillis() to a local var and then use that for the expression above.

          Show
          Alejandro Abdelnur added a comment - oh well. looks good, one small NIT though + return result.getTimeInMillis() - + result.getTimeInMillis() % HistoryCleaner.ONE_DAY_IN_MS; Assign getTimeInMillis() to a local var and then use that for the expression above.
          Hide
          Sandy Ryza added a comment -

          trunk already uses milliseconds. Should branch-1 be different?

          Show
          Sandy Ryza added a comment - trunk already uses milliseconds. Should branch-1 be different?
          Hide
          Alejandro Abdelnur added a comment -

          using ms as unit for a value that will normally be in days seems a bit off. I understand that for testing purposes we want something short enough, can we use seconds instead? (saving 3 digits). Ideally we should be able to specify units on the time interval (ie '2ms', '2d'), but this is completely out of scope of this JIRA. Thx

          Show
          Alejandro Abdelnur added a comment - using ms as unit for a value that will normally be in days seems a bit off. I understand that for testing purposes we want something short enough, can we use seconds instead? (saving 3 digits). Ideally we should be able to specify units on the time interval (ie '2ms', '2d'), but this is completely out of scope of this JIRA. Thx
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12546557/MAPREDUCE-4643-branch1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2877//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546557/MAPREDUCE-4643-branch1.patch against trunk revision . -1 patch . The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2877//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          Uploaded a patch that fixes the comments and includes the test

          Show
          Sandy Ryza added a comment - Uploaded a patch that fixes the comments and includes the test
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12546425/MAPREDUCE-4643-branch1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2875//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546425/MAPREDUCE-4643-branch1.patch against trunk revision . -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2875//console This message is automatically generated.
          Hide
          Karthik Kambatla added a comment -

          Thanks for the patch, Sandy. It looks like it should work.

          Minor comments:

          1. In JobHistory.directoryTime(), the initial comments do not apply anymore and can be removed
          2. Do you want to add a test to verify the timely deletion of history files? (Now, I see that this test would fix MAPREDUCE-4676 as well). Do you think you can add a test here and mark the other one duplicate?
          Show
          Karthik Kambatla added a comment - Thanks for the patch, Sandy. It looks like it should work. Minor comments: In JobHistory.directoryTime() , the initial comments do not apply anymore and can be removed Do you want to add a test to verify the timely deletion of history files? (Now, I see that this test would fix MAPREDUCE-4676 as well). Do you think you can add a test here and mark the other one duplicate?
          Hide
          Sandy Ryza added a comment -

          I spoke with Karthik and we resolved that the patch does work for smaller ages, but that for directories older than a day, it could work more efficiently by deleting the entire directory.

          Show
          Sandy Ryza added a comment - I spoke with Karthik and we resolved that the patch does work for smaller ages, but that for directories older than a day, it could work more efficiently by deleting the entire directory.
          Hide
          Sandy Ryza added a comment -

          It should work for significantly smaller ages as well. It uses the directory names to decide which to look in, but checks the modification times on the files inside them.

          Show
          Sandy Ryza added a comment - It should work for significantly smaller ages as well. It uses the directory names to decide which to look in, but checks the modification times on the files inside them.
          Hide
          Karthik Kambatla added a comment -

          Thanks for your patch, Sandy.

          The patch looks like it should work for (max-age > 1 day). How do we handle significantly smaller ages? We should probably address it the same it is addressed in trunk - use the modification time? Thoughts?

          Show
          Karthik Kambatla added a comment - Thanks for your patch, Sandy. The patch looks like it should work for (max-age > 1 day). How do we handle significantly smaller ages? We should probably address it the same it is addressed in trunk - use the modification time? Thoughts?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12546397/MAPREDUCE-4643.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2874//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546397/MAPREDUCE-4643.patch against trunk revision . -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2874//console This message is automatically generated.
          Hide
          Sandy Ryza added a comment -

          Ah, didn't see this JIRA until now. The patch I'm working on for MAPREDUCE-4676 leaves the directory structure the same and deletes based on the modification time (only checking files in old enough directories). For the testJobHistoryCleaner test, it would be ideal for it to be configurable to milliseconds. What do you think?

          Show
          Sandy Ryza added a comment - Ah, didn't see this JIRA until now. The patch I'm working on for MAPREDUCE-4676 leaves the directory structure the same and deletes based on the modification time (only checking files in old enough directories). For the testJobHistoryCleaner test, it would be ideal for it to be configurable to milliseconds. What do you think?
          Hide
          Karthik Kambatla added a comment -

          The two aspects of making the job-history cleanup period configurable seem to be:

          1. unit of time to configure it on (minutes vs days)
          2. directory naming to reflect the creation time

          For the directory naming, we should be able to leave it as yyyy/mm/dd format. If the cleanup period is in hours/minutes, we can read the file modification time and use that to do the comparison. Trunk currently uses the modification time even for higher values of cleanup period (days) and has a TODO note that we should use the directory structure instead to reduce the load on HDFS.

          With regards to the unit of time for configuration, minutes seems to be the better option.

          Thoughts?

          Show
          Karthik Kambatla added a comment - The two aspects of making the job-history cleanup period configurable seem to be: unit of time to configure it on (minutes vs days) directory naming to reflect the creation time For the directory naming, we should be able to leave it as yyyy/mm/dd format. If the cleanup period is in hours/minutes, we can read the file modification time and use that to do the comparison. Trunk currently uses the modification time even for higher values of cleanup period (days) and has a TODO note that we should use the directory structure instead to reduce the load on HDFS. With regards to the unit of time for configuration, minutes seems to be the better option. Thoughts?

            People

            • Assignee:
              Sandy Ryza
              Reporter:
              Karthik Kambatla
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development