Hadoop Common / HADOOP-6859

Introduce additional statistics to FileSystem

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently FileSystem#statistics tracks bytesRead and bytesWritten. Additional statistics that summarize the operations performed would be useful for tracking file system use.

      1. HADOOP-6859.y20.patch
        3 kB
        Suresh Srinivas
      2. HADOOP-6859.patch
        3 kB
        Suresh Srinivas

          Activity

          Suresh Srinivas added a comment -

          I am planning to introduce the following additional statistics, accumulated at the client per file system, as is done for the existing statistics.

          1. read operations - number of read operations such as listStatus, getFileBlockLocations, open etc.
          2. write operations - number of write operations such as create, append, setPermission etc.
          3. large read operations - on a file system, most operations are small, except listFiles for a large directory. Iterative listFiles was introduced in HDFS to break down a single large operation into smaller steps. This counter is incremented for every iteration of listFiles when listing files under a large directory.

          These statistics are collected in job history for analysis of how HDFS is loaded by map reduce tasks. This is useful in the interim to identify jobs that heavily load HDFS. In future this could also be used to throttle the load at the map reduce framework.
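
          Editor's sketch (not taken from the attached patch): the three counters described above could be kept per file system roughly as follows. The class name OpStatistics, the scheme field, and every method name except getLargeReadOps() are assumptions made for illustration. The real change lives in FileSystem#statistics and may differ in detail.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical per-file-system operation counters.
 * This is an illustrative stand-in, not the actual Hadoop class.
 */
final class OpStatistics {
    // File system scheme these counters belong to, e.g. "hdfs" (assumed field).
    private final String scheme;
    // Atomic counters so that several client threads can update them safely.
    private final AtomicLong readOps = new AtomicLong();
    private final AtomicLong largeReadOps = new AtomicLong();
    private final AtomicLong writeOps = new AtomicLong();

    OpStatistics(String scheme) {
        this.scheme = scheme;
    }

    /** Count read operations such as listStatus, getFileBlockLocations, open. */
    void incrementReadOps(int count) {
        readOps.addAndGet(count);
    }

    /** Count large read operations; what is "large" is up to the file system. */
    void incrementLargeReadOps(int count) {
        largeReadOps.addAndGet(count);
    }

    /** Count write operations such as create, append, setPermission. */
    void incrementWriteOps(int count) {
        writeOps.addAndGet(count);
    }

    String getScheme() {
        return scheme;
    }

    long getReadOps() {
        return readOps.get();
    }

    long getLargeReadOps() {
        return largeReadOps.get();
    }

    long getWriteOps() {
        return writeOps.get();
    }
}
```

          In this sketch the three counters are kept strictly separate. Whether the real accessors fold large reads into the read count is not stated in this issue.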

          Suresh Srinivas added a comment -

          Attached patch adds new statistics.

          Philip Zeyliger added a comment -

          Patch looks fine. You might want to mention in the javadoc what you mentioned here in the JIRA about what a "large read" operation may be.

          Suresh Srinivas added a comment -

          What counts as a large read operation is left to the file system implementation. In this JIRA I am explaining it in the context of HDFS. It is also documented in the javadoc for the getLargeReadOps() method.
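
          Editor's sketch (not from the patch): one way an implementation might count large reads is to bump the counter once per batch while listing a directory iteratively. BatchedLister, listAll, and the in-memory list standing in for a directory are all hypothetical. A real HDFS client fetches each batch from the NameNode, and the choice of which iterations count as large reads belongs to the implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical iterative lister that counts one large read op per batch. */
final class BatchedLister {
    private final AtomicLong largeReadOps = new AtomicLong();
    private final int batchSize;

    BatchedLister(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /**
     * Returns every entry of the (in-memory) directory, fetching at most
     * batchSize entries per iteration and counting each iteration.
     */
    List<String> listAll(List<String> directory) {
        List<String> result = new ArrayList<>();
        for (int start = 0; start < directory.size(); start += batchSize) {
            int end = Math.min(start + batchSize, directory.size());
            // One iteration of the listing: copy a single batch of entries.
            result.addAll(directory.subList(start, end));
            // Each iteration is counted as one large read operation.
            largeReadOps.incrementAndGet();
        }
        return result;
    }

    long getLargeReadOps() {
        return largeReadOps.get();
    }
}
```

          With a batch size of 2, listing a directory of 5 entries takes 3 iterations, so the counter ends at 3. An empty directory leaves it at 0.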

          Konstantin Shvachko added a comment -

          +1 Both patches look good to me.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12449390/HADOOP-6859.patch
          against trunk revision 964134.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/618/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/618/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/618/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/618/console

          This message is automatically generated.

          Suresh Srinivas added a comment -

          I do not get a javadoc warning when I run testpatch. Submitting it again to see whether it is flagged again.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12449390/HADOOP-6859.patch
          against trunk revision 964134.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/619/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/619/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/619/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/619/console

          This message is automatically generated.

          Suresh Srinivas added a comment -

          Looking at the Hudson console output, the javadoc warning is unrelated to this patch. Here are the javadoc warnings:
          [exec] [javadoc] /grid/0/hudson/hudson-slave/workspace/Hadoop-Patch-h4.grid.sp2.yahoo.net/trunk/src/java/org/apache/hadoop/security/KerberosName.java:31: warning: sun.security.krb5.Config is Sun proprietary API and may be removed in a future release
          [exec] [javadoc] import sun.security.krb5.Config;
          [exec] [javadoc] ^
          [exec] [javadoc] /grid/0/hudson/hudson-slave/workspace/Hadoop-Patch-h4.grid.sp2.yahoo.net/trunk/src/java/org/apache/hadoop/security/KerberosName.java:32: warning: sun.security.krb5.KrbException is Sun proprietary API and may be removed in a future release
          [exec] [javadoc] import sun.security.krb5.KrbException;
          [exec] [javadoc] ^
          [exec] [javadoc] /grid/0/hudson/hudson-slave/workspace/Hadoop-Patch-h4.grid.sp2.yahoo.net/trunk/src/java/org/apache/hadoop/security/KerberosName.java:81: warning: sun.security.krb5.Config is Sun proprietary API and may be removed in a future release
          [exec] [javadoc] private static Config kerbConf;
          [exec] [javadoc] ^
          [exec] [javadoc] /grid/0/hudson/hudson-slave/workspace/Hadoop-Patch-h4.grid.sp2.yahoo.net/trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:33: warning: sun.security.jgss.krb5.Krb5Util is Sun proprietary API and may be removed in a future release
          [exec] [javadoc] import sun.security.jgss.krb5.Krb5Util;
          [exec] [javadoc] ^
          [exec] [javadoc] /grid/0/hudson/hudson-slave/workspace/Hadoop-Patch-h4.grid.sp2.yahoo.net/trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:34: warning: sun.security.krb5.Credentials is Sun proprietary API and may be removed in a future release
          [exec] [javadoc] import sun.security.krb5.Credentials;
          [exec] [javadoc] ^
          [exec] [javadoc] /grid/0/hudson/hudson-slave/workspace/Hadoop-Patch-h4.grid.sp2.yahoo.net/trunk/src/java/org/apache/hadoop/security/SecurityUtil.java:35: warning: sun.security.krb5.PrincipalName is Sun proprietary API and may be removed in a future release
          [exec] [javadoc] import sun.security.krb5.PrincipalName;

          Suresh Srinivas added a comment -

          This patch does not include tests. Tests are done as part of each specific file system implementation. Tests for HDFS will be added in HDFS-1298.

          Suresh Srinivas added a comment -

          I committed the patch.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #326 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/326/)
          HADOOP-6859 - Introduce additional statistics to FileSystem to track file system operations. Contributed by Suresh Srinivas.

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk #394 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/394/)
          HADOOP-6859 - Introduce additional statistics to FileSystem to track file system operations. Contributed by Suresh Srinivas.

          Suresh Srinivas added a comment -

          y20 version of the patch


            People

            • Assignee: Suresh Srinivas
            • Reporter: Suresh Srinivas
            • Votes: 0
            • Watchers: 4
