Hadoop Common / HADOOP-3336

Direct a subset of namenode RPC events for audit logging

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Added a log4j appender that emits events from FSNamesystem for audit logging

      Description

      A non-persistent transaction log will permit managers of HDFS installations to monitor and reconstruct user activity in HDFS for forensic analysis and maintenance.

      1. 3336-0.patch
        5 kB
        Chris Douglas
      2. 3336-1.patch
        5 kB
        Chris Douglas
      3. 3336-2.patch
        6 kB
        Chris Douglas

          Activity

          Hudson added a comment -

          Integrated in Hadoop-trunk #493 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/493/ )

          Chris Douglas added a comment -

          I just committed this.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12382002/3336-2.patch
          against trunk revision 655984.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2458/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2458/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2458/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2458/console

          This message is automatically generated.

          Chris Douglas added a comment -

          I talked with Allen, and he suggested that, particularly for creates, the resulting owner, group, and permission be printed rather than only those supplied in the RPC.

          dhruba borthakur added a comment -

          +1.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12381915/3336-1.patch
          against trunk revision 655674.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2453/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2453/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2453/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2453/console

          This message is automatically generated.

          Chris Douglas added a comment -

          Never mind; I misunderstood your original comment. This implements the change Dhruba was suggesting.

          Chris Douglas added a comment -

          I agree with Dhruba; calling toString on the UGI causes the logging to dig too deeply into unrelated code. The patch contains some scratch code that needs to be removed, too.

          dhruba borthakur added a comment -

          The current patch assumes that the first column is the ugi, the second column is the permissions, the third column is the command, and so on. We can make the format a wee bit forward compatible if we have it in the form

          "ugi=xxx\tperm=xxx\tcmd=xxx"

          Then, an application parsing this log output can determine whether the column it is looking for exists or not.
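
          To illustrate the point about forward compatibility, here is a minimal sketch of how a consumer might parse a tab-separated key=value line and check whether a column exists. The class name AuditLineParser and the sample values are hypothetical and are not taken from the patch; only the "ugi=xxx\tperm=xxx\tcmd=xxx" shape comes from the comment above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for one audit line of the form "ugi=xxx\tperm=xxx\tcmd=xxx". */
final class AuditLineParser {
    private AuditLineParser() {}

    /** Splits the line on tabs, then each field on its first '='; key order is preserved. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String field : line.split("\t")) {
            int eq = field.indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value field; skip it rather than guess
            }
            fields.put(field.substring(0, eq), field.substring(eq + 1));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        Map<String, String> fields = parse("ugi=alice,users\tperm=rw-r--r--\tcmd=create");
        // A consumer can test for a column instead of relying on its position.
        System.out.println("has cmd: " + fields.containsKey("cmd"));
        System.out.println("has src: " + fields.containsKey("src"));
    }
}
```

          Because fields are looked up by key, a later release could add or reorder columns without breaking a parser written this way.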

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12381725/3336-0.patch
          against trunk revision 654315.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2435/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2435/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2435/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2435/console

          This message is automatically generated.

          Chris Douglas added a comment -

          "if we plan to use this File Change Log for backup and/or remote mirroring, then its constraints will have to be more stringent"

          Agreed.

          This patch exposes a general audit logging API through the commons logging interface. All audit events are at INFO level.

          dhruba borthakur added a comment -

          This is a File Change Log http://en.wikipedia.org/wiki/File_Change_Log . It can be used for audit purposes. It can also be used for backup and/or synchronization with remote mirror sites.

          I agree with Chris that for auditing purposes, it might be sufficient to not persist this log. However, if we plan to use this File Change Log for backup and/or remote mirroring, then its constraints will have to be more stringent.

          Chris Douglas added a comment -

          The easiest way to implement this will be by adding a log4j appender that emits events from FSNamesystem. This way, it can be turned off by default but enabled/configured by administrators. The subset of events should probably be restricted to those mapped to DFSClient calls. As a first pass: create (startFile), mkdirs, setOwner, setPermission, delete, rename, open (getBlockLocations?), getFileStatus, setReplication, and listStatus all look like reasonable events to log. For all events, the ugi and path will be logged (date/time, etc. should be handled by the appender). For create, mkdirs, setOwner, and setPermission, both the ugi and the FsPermission information will be logged.

          Thoughts? This isn't designed to be a secure audit log (and I'm sure issues like HADOOP-1741 will affect the approach to future audit logging), but it should provide sufficient information for administrators to manage HDFS.
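
          The event list in this proposal can be sketched roughly as follows. This is an illustration only, not the code in the attached patches: the class AuditSketch, the logger name "example.audit", and the column order are all assumptions, and java.util.logging stands in for the log4j appender so that the example has no external dependencies.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of the proposed event subset; not the patch's code. */
final class AuditSketch {
    /** Events named in the proposal; the flag marks those that also log permission info. */
    enum Event {
        CREATE(true), MKDIRS(true), SET_OWNER(true), SET_PERMISSION(true),
        DELETE(false), RENAME(false), OPEN(false), GET_FILE_STATUS(false),
        SET_REPLICATION(false), LIST_STATUS(false);

        final boolean logsPermission;

        Event(boolean logsPermission) {
            this.logsPermission = logsPermission;
        }
    }

    // Logger name is an assumption made for this example.
    static final Logger AUDIT = Logger.getLogger("example.audit");

    private AuditSketch() {}

    /** Builds one tab-separated record; date/time is left to the logging backend. */
    static String describe(Event event, String ugi, String path, String perm) {
        StringBuilder sb = new StringBuilder();
        sb.append(ugi).append('\t').append(event).append('\t').append(path);
        if (event.logsPermission && perm != null) {
            sb.append('\t').append(perm);
        }
        return sb.toString();
    }

    /** Emits the record only when the audit logger is enabled at INFO. */
    static void audit(Event event, String ugi, String path, String perm) {
        if (AUDIT.isLoggable(Level.INFO)) {
            AUDIT.info(describe(event, ugi, path, perm));
        }
    }

    public static void main(String[] args) {
        // Sample values are made up for illustration.
        audit(Event.CREATE, "alice", "/user/alice/data", "rw-r--r--");
        audit(Event.DELETE, "alice", "/user/alice/data", null);
    }
}
```

          Guarding the call with a level check keeps the cost of building the record near zero when auditing is disabled, which fits the goal of shipping it turned off and letting administrators enable it through logging configuration.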


            People

            • Assignee:
              Chris Douglas
              Reporter:
              Chris Douglas
            • Votes:
              0
              Watchers:
              2
