Hadoop Common
  1. Hadoop Common
  2. HADOOP-5042

Add expiration handling to the chukwa log4j appender

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Chukwwa Log4J appender options allow a retention policy to limit number of files.

      Description

      Chukwa log4j appender is not doing any sort of cleanup.
      The idea here is to keep only n rotate files and delete the older ones.
      This way we don't have to worry about manually cleaning old files

      1. HADOOP-5042-2.patch
        5 kB
        Jerome Boulon
      2. HADOOP-5042.patch
        5 kB
        Jerome Boulon

        Issue Links

          Activity

          Tom White made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Robert Chansler made changes -
          Release Note Chukwwa Log4J appender options allow a retention policy to limit number of files.
          Hide
          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Show
          Robert Chansler added a comment - Editorial pass over all release notes prior to publication of 0.21.
          Owen O'Malley made changes -
          Component/s contrib/chukwa [ 12312445 ]
          Hide
          Hudson added a comment -
          Show
          Hudson added a comment - Integrated in Hadoop-trunk #766 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/766/ )
          Jerome Boulon made changes -
          Description Chukwa log4j appender is not doing any sort of cleanup.
          The idea here is to keep only n rotate files and delete the older ones.
          This way we don't have to worry about manually cleaning old files
          Chris Douglas made changes -
          Hadoop Flags [Reviewed]
          Fix Version/s 0.21.0 [ 12313563 ]
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          Chris Douglas added a comment -

          I committed this. Thanks, Jerome

          Show
          Chris Douglas added a comment - I committed this. Thanks, Jerome
          Hide
          Ari Rabkin added a comment -

          Patch looks fine. +1

          Show
          Ari Rabkin added a comment - Patch looks fine. +1
          Hide
          Jerome Boulon added a comment -

          -1 core tests. The patch failed core unit tests. is not related to that patch

          Show
          Jerome Boulon added a comment - -1 core tests. The patch failed core unit tests. is not related to that patch
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12400634/HADOOP-5042-2.patch
          against trunk revision 746340.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12400634/HADOOP-5042-2.patch against trunk revision 746340. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3898/console This message is automatically generated.
          Jerome Boulon made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Jerome Boulon made changes -
          Attachment HADOOP-5042-2.patch [ 12400634 ]
          Hide
          Jerome Boulon added a comment -

          new patch

          Show
          Jerome Boulon added a comment - new patch
          Jerome Boulon made changes -
          Attachment HADOOP-5042-2.patch [ 12400632 ]
          Jerome Boulon made changes -
          Comment [ new patch ]
          Jerome Boulon made changes -
          Attachment HADOOP-5042-2.patch [ 12400632 ]
          Chris Douglas made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Chris Douglas added a comment -

          The new patch still adds @author tags...

          Show
          Chris Douglas added a comment - The new patch still adds @author tags...
          Hide
          Jerome Boulon added a comment -

          Junit failed because of HADOOP-5138

          Show
          Jerome Boulon added a comment - Junit failed because of HADOOP-5138
          Jerome Boulon made changes -
          Link This issue is blocked by HADOOP-5138 [ HADOOP-5138 ]
          Jerome Boulon made changes -
          Link This issue blocks HADOOP-5058 [ HADOOP-5058 ]
          Eric Yang made changes -
          Attachment HADOOP-5042.patch [ 12397996 ]
          Jerome Boulon made changes -
          Attachment HADOOP-5042.patch [ 12397996 ]
          Hide
          Jerome Boulon added a comment -

          Revise to take feedback into account + add one more check to exclude current log file from the list

          Show
          Jerome Boulon added a comment - Revise to take feedback into account + add one more check to exclude current log file from the list
          Hide
          Jerome Boulon added a comment -
          • the @Author tags were already there in my version (Revision 685353 ), I will remove them
          • the sort is out of any loop and it's done only once per rotate period so it should not cause any performance issue
          • I will put a check on $filename before applying the regex to take into account Ari's feedback regarding the security issue
          Show
          Jerome Boulon added a comment - the @Author tags were already there in my version (Revision 685353 ), I will remove them the sort is out of any loop and it's done only once per rotate period so it should not cause any performance issue I will put a check on $filename before applying the regex to take into account Ari's feedback regarding the security issue
          Hide
          Eric Yang added a comment -

          It may be better to match files starting with the same name, and stat them and delete oldest file(s) up the maxBackupIndex specified number. It's cheaper to do file stat than regex filename matching.

          Show
          Eric Yang added a comment - It may be better to match files starting with the same name, and stat them and delete oldest file(s) up the maxBackupIndex specified number. It's cheaper to do file stat than regex filename matching.
          Hide
          Ari Rabkin added a comment -

          Couple things I'm nervous about.
          1) This patch introduces @author tags. This conflicts with Hadoop coding standards. Is there a reason for adding them?
          2) You do a lexical sort to figure out which rotated files are old. Is this safe in context?
          3) It's sort of icky to repeatedly remove the first element from an array – the remove is O(N) each time. That''s probably okay here, since the array never gets big and we don't run this very often. But it might be nice to leave a comment explaining why this is okay.
          4) This might not bother us, but there's a little bit of a security/correctness issue here: The pattern can easily match files that don't contain $fileName – just do something like "$filename | .". And we don't check for that. A fix would be to first match against .$filename.* before applying the user's regex.

          Show
          Ari Rabkin added a comment - Couple things I'm nervous about. 1) This patch introduces @author tags. This conflicts with Hadoop coding standards. Is there a reason for adding them? 2) You do a lexical sort to figure out which rotated files are old. Is this safe in context? 3) It's sort of icky to repeatedly remove the first element from an array – the remove is O(N) each time. That''s probably okay here, since the array never gets big and we don't run this very often. But it might be nice to leave a comment explaining why this is okay. 4) This might not bother us, but there's a little bit of a security/correctness issue here: The pattern can easily match files that don't contain $fileName – just do something like "$filename | . ". And we don't check for that. A fix would be to first match against . $filename.* before applying the user's regex.
          Jerome Boulon made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          Jerome Boulon made changes -
          Attachment HADOOP-5042.patch [ 12397920 ]
          Hide
          Jerome Boulon added a comment -

          Adding these 2 parameters to the log4j properties file, will allow the Chukwalog4JAppender to automatically do some cleanup on the log file

          log4j.appender.R.maxBackupIndex=3
          log4j.appender.R.cleanUpRegex=$fileName.20*

          Show
          Jerome Boulon added a comment - Adding these 2 parameters to the log4j properties file, will allow the Chukwalog4JAppender to automatically do some cleanup on the log file log4j.appender.R.maxBackupIndex=3 log4j.appender.R.cleanUpRegex=$fileName.20*
          Jerome Boulon made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Jerome Boulon made changes -
          Field Original Value New Value
          Assignee Jerome Boulon [ jboulon ]
          Jerome Boulon created issue -

            People

            • Assignee:
              Jerome Boulon
              Reporter:
              Jerome Boulon
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development