Hadoop Common
  1. Hadoop Common
  2. HADOOP-5022

[HOD] logcondense should delete all hod logs for a user, including jobtracker logs

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: contrib/hod
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      New logcondense option retain-master-logs indicates whether the script should delete master logs as part of its cleanup process. By default this option is false; master logs are deleted. Earlier versions of logcondense did not delete master logs.

      Description

      Currently, logcondense.py does not delete jobtracker logs that it uploads to the DFS when the HOD cluster is deallocated. This will result in the hod-logs directory to slowly accumulate a whole bunch of jobtracker logs. Particularly for users who run a lot of user jobs, this could fill up the namespace. Further these directories will cause the logcondense program to keep repeatedly looking at these directories stressing out the namenode. So, logcondense.py should optionally also delete the jobtracker logs.

      1. hadoop-5022.txt
        3 kB
        Peeyush Bishnoi
      2. hadoop-5022-1.txt
        4 kB
        Peeyush Bishnoi
      3. hadoop-5022-2.txt
        4 kB
        Peeyush Bishnoi
      4. hadoop-5022-3.txt
        4 kB
        Hemanth Yamijala

        Activity

        Hide
        Robert Chansler added a comment -

        Editorial pass over all release notes prior to publication of 0.21.

        Show
        Robert Chansler added a comment - Editorial pass over all release notes prior to publication of 0.21.
        Hide
        Hemanth Yamijala added a comment -

        I just committed this to trunk. Thanks, Peeyush !

        Show
        Hemanth Yamijala added a comment - I just committed this to trunk. Thanks, Peeyush !
        Hide
        Hemanth Yamijala added a comment -

        I made some very minor changes to the attached patch. The following:

        • modified the comments in the code a bit to better reflect the current algorithm
        • changed the name of the option to retain-master-logs instead of retain-masters-logs. Also updated the name in the documentation.
        • changed the path being deleted if retain-master-logs is false to remove the final / in the patch. This was unnecessary. I've tested this on my local box and it seems to work fine.

        test-patch results are as follows. Hod tests continue to be done manually, outside the unit test cycle of Hadoop.

        [exec] -1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
        [exec] Please justify why no tests are needed for this patch.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        Show
        Hemanth Yamijala added a comment - I made some very minor changes to the attached patch. The following: modified the comments in the code a bit to better reflect the current algorithm changed the name of the option to retain-master-logs instead of retain-masters-logs. Also updated the name in the documentation. changed the path being deleted if retain-master-logs is false to remove the final / in the patch. This was unnecessary. I've tested this on my local box and it seems to work fine. test-patch results are as follows. Hod tests continue to be done manually, outside the unit test cycle of Hadoop. [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no tests are needed for this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        +1 for the changes in the patch.

        Show
        Vinod Kumar Vavilapalli added a comment - +1 for the changes in the patch.
        Hide
        Peeyush Bishnoi added a comment -

        With this new patch minor change has been done in logcondense.py "--retain-masters-logs" options 'help ' attribute .

        Also testing of the patch has been done.

        If options " r " or "retain-masters-logs" is set to 'true' then , log of masters ( JobTracker and Namenode if "dynamicdfs" is set ) , does not get removed . But If options " -r " or "-retain-masters-logs" is set to 'false' then complete job directory inside hod-logs in HDFS gets removed .

        Show
        Peeyush Bishnoi added a comment - With this new patch minor change has been done in logcondense.py "--retain-masters-logs" options 'help ' attribute . Also testing of the patch has been done. If options " r " or " retain-masters-logs" is set to 'true' then , log of masters ( JobTracker and Namenode if " dynamicdfs" is set ) , does not get removed . But If options " -r " or " -retain-masters-logs" is set to 'false' then complete job directory inside hod-logs in HDFS gets removed . —
        Hide
        Peeyush Bishnoi added a comment -

        Thanks! for the comments. I incorporated the changes in the patch as per the comments provided and uploaded the new patch . This new patch has "r " and "retain-masters-logs" as option that need to be passed with value either "true" or "false" while running the logcondense.py script . By default the option is set to 'false' . It means it will delete the complete job directory inside hod-logs in HDFS. But if option is set to 'true' it will delete only the tasktracker logs . It will delete the Datanode logs , if "-dynamicdfs" is 'true'.

        For example:
        python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user -r true

        Show
        Peeyush Bishnoi added a comment - Thanks! for the comments. I incorporated the changes in the patch as per the comments provided and uploaded the new patch . This new patch has " r " and " retain-masters-logs" as option that need to be passed with value either "true" or "false" while running the logcondense.py script . By default the option is set to 'false' . It means it will delete the complete job directory inside hod-logs in HDFS. But if option is set to 'true' it will delete only the tasktracker logs . It will delete the Datanode logs , if " -dynamicdfs" is 'true'. For example: python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user -r true —
        Hide
        Hemanth Yamijala added a comment -

        I looked at the patch. Some comments:

        • Agree with Vinod on the option's name. I think the default value can be 'false' and we can mark this as an incompatible change. This seems like a required behavior.
        • The code that's deleting the files seems to be incorrectly indented. Note that the cmd variable is overwritten while iterating over prefixes. This is not how the code was previously. Can you please check.

        Also, please do test this carefully.

        Show
        Hemanth Yamijala added a comment - I looked at the patch. Some comments: Agree with Vinod on the option's name. I think the default value can be 'false' and we can mark this as an incompatible change. This seems like a required behavior. The code that's deleting the files seems to be incorrectly indented. Note that the cmd variable is overwritten while iterating over prefixes. This is not how the code was previously. Can you please check. Also, please do test this carefully.
        Hide
        Vinod Kumar Vavilapalli added a comment -

        The patch looks good, seems to complete its requirements. I don't have a cluster set up with old logs, so couldn't test it live.

        Some minor code comments w.r.t logcondense.py:

        • Line +100 : the dictionary related to the newly added option can be indented. It won't give a compilation problem now, but it can be indented nevertheless for better readability
        • The option name can be something in the lines of "retain-masters-logs" instead of "cleanall". It should be true by default (current scenario). The description string can be something like "true if the logs of the masters(jobtracker and namenode if dynamicdfs is set) have to be retained, false if everything has to be removed"
        • Line +191 : ret = 0 not needed
        • Line +135 : spurious blank line to be removed

        Another point to be noted is that the code path covered by this patch could have been much simpler had the job-specific directories(present in hod-logs parent directory) themselves had timestamp in their names. If that were the case, the output needed from dfs as well as the number of filenames processed would have been much lesser. But, this needs changes in hod and so the fix in the provided patch should suffice for now.

        Show
        Vinod Kumar Vavilapalli added a comment - The patch looks good, seems to complete its requirements. I don't have a cluster set up with old logs, so couldn't test it live. Some minor code comments w.r.t logcondense.py: Line +100 : the dictionary related to the newly added option can be indented. It won't give a compilation problem now, but it can be indented nevertheless for better readability The option name can be something in the lines of "retain-masters-logs" instead of "cleanall". It should be true by default (current scenario). The description string can be something like "true if the logs of the masters(jobtracker and namenode if dynamicdfs is set) have to be retained, false if everything has to be removed" Line +191 : ret = 0 not needed Line +135 : spurious blank line to be removed Another point to be noted is that the code path covered by this patch could have been much simpler had the job-specific directories(present in hod-logs parent directory) themselves had timestamp in their names. If that were the case, the output needed from dfs as well as the number of filenames processed would have been much lesser. But, this needs changes in hod and so the fix in the provided patch should suffice for now.
        Hide
        Peeyush Bishnoi added a comment -

        Patch is attached for logcondense.py that will optionally delete the JobTracker logs and also update the logcondense.py documentation . In fact the patch will delete the complete job directory inside hod-logs in DFS if option "a" or "-all" is set to 'true' .

        TaskTracker logs gets deleted if option "a" or "-all" is set to 'false' or if option is not set . By default option is set to 'false'.

        For example:
        python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user -a true

        Show
        Peeyush Bishnoi added a comment - Patch is attached for logcondense.py that will optionally delete the JobTracker logs and also update the logcondense.py documentation . In fact the patch will delete the complete job directory inside hod-logs in DFS if option " a" or " -all" is set to 'true' . TaskTracker logs gets deleted if option " a" or " -all" is set to 'false' or if option is not set . By default option is set to 'false'. For example: python logcondense.py -p ~/hadoop-0.17.0/bin/hadoop -d 7 -c ~/hadoop-conf -l /user -a true —
        Hide
        Marco Nicosia added a comment -

        Thanks Hemanth. This is having a big impact on our running Hadoop clusters, so marking as blocker for 0.18.3.

        Show
        Marco Nicosia added a comment - Thanks Hemanth. This is having a big impact on our running Hadoop clusters, so marking as blocker for 0.18.3.

          People

          • Assignee:
            Peeyush Bishnoi
            Reporter:
            Hemanth Yamijala
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development