Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.19.1
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Add I/O duration information into client trace log for analyzing performance.

      1. duration.patch (5 kB), Lei (Eddy) Xu
      2. duration-1.patch (7 kB), Lei (Eddy) Xu
      3. duration-0.19.1.patch (7 kB), Lei (Eddy) Xu
      4. duration-3.patch (7 kB), Lei (Eddy) Xu
      5. HADOOP-5625-BP-to-20.patch (7 kB), Jakob Homan

        Activity

        Robert Chansler added a comment -

        Editorial pass over all release notes prior to publication of 0.21. Routine.

        Jakob Homan added a comment -

        Attaching patch for backporting this feature to Y!'s 20. This patch is dependent on HADOOP-5222, which applies against Y! 20 cleanly.

        Hudson added a comment -

        Integrated in Hadoop-trunk #813 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/813/).
        Add operation duration to clienttrace. Contributed by Lei Xu.

        Chris Douglas added a comment -

        I committed this. Thanks, Lei

        Chris Douglas added a comment -
        Test org.apache.hadoop.hdfs.server.namenode.TestReplicationPolicy FAILED
        Test org.apache.hadoop.mapred.TestMRServerPorts FAILED
        Test org.apache.hadoop.mapred.TestQueueCapacities FAILED
        

        None of the test failures are related to this patch.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12405185/duration-3.patch
        against trunk revision 764085.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/182/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/182/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/182/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/182/console

        This message is automatically generated.

        Chris Douglas added a comment -

        +1

        Lei (Eddy) Xu added a comment -

        Use System.nanoTime() to measure the duration of each I/O operation.

        Lei (Eddy) Xu added a comment -

        This patch works for 0.19.1. It has to be applied after HADOOP-5222.

        Because of formatting differences between 0.19.1 and svn HEAD, I am submitting this separate patch for building a 0.19.1 release.

        Hong Tang added a comment -

        If we only care about elapsed time, maybe System.nanoTime is a better choice? (It offers better resolution and is less expensive than currentTimeMillis.)
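
A minimal sketch of the elapsed-time measurement suggested here, using System.nanoTime(). The class and method names (ElapsedNanos, elapsedNanos) are hypothetical and are not taken from the patch; the sleep in main only stands in for an I/O operation.

```java
import java.util.concurrent.TimeUnit;

public class ElapsedNanos {
    // Runs the operation and returns the elapsed time in nanoseconds.
    // System.nanoTime() is only meaningful as a difference between two
    // readings taken in the same JVM; it is not a wall-clock timestamp.
    static long elapsedNanos(Runnable op) {
        final long start = System.nanoTime();
        op.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a block read or write.
        long nanos = elapsedNanos(() -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("elapsed ms: " + TimeUnit.NANOSECONDS.toMillis(nanos));
    }
}
```

Because only the difference between two readings is used, the result is not affected by adjustments to the system clock, which is one reason nanoTime suits duration measurement.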

        Lei (Eddy) Xu added a comment -

        I put the endTime assignment just after the receiver close(). You are right, this approach is more accurate.

        I also added an entry to TaskTracker; however, I am not familiar with TaskTracker, so I am not sure it is correct.

        Chris Douglas added a comment -

        This will be useful. A couple nits:

        • System.currentTimeMillis can be fairly expensive; would it make sense to guard the call with ClientTraceLog.isInfoEnabled? It should also be final, as in
          final long starttime = ClientTraceLog.isInfoEnabled() ? System.currentTimeMillis() : 0;
        • Similarly, endtime should be final and assigned right after the operation completes. In BlockReceiver, it's not obvious to me whether it should follow the set of synchronized calls into the datanode that precede it (as in the current patch) or if the time of the transfer is more accurately recorded before those calls. If this will be used to calculate network metrics, then the latter should be preferred.
        • This should also add an entry for duration to the shuffle metrics in the TT clienttrace log, in TaskTracker.MapOutputServlet.
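
The first two points above can be sketched roughly as follows. This is an illustration under assumptions, not the patch itself: java.util.logging stands in for the real ClientTraceLog, and the class name, method name, and log message format are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedTiming {
    // Hypothetical stand-in for the ClientTraceLog mentioned in the review.
    static final Logger TRACE = Logger.getLogger("clienttrace.example");

    // Runs the operation; returns the measured duration in milliseconds,
    // or 0 when trace logging is disabled and the clock is never read.
    static long timedOperation(Runnable op) {
        final boolean tracing = TRACE.isLoggable(Level.INFO);
        // Guard the clock read so it is skipped when nothing will be logged.
        final long starttime = tracing ? System.currentTimeMillis() : 0;
        op.run();
        // Assigned immediately after the operation completes.
        final long endtime = tracing ? System.currentTimeMillis() : 0;
        final long duration = endtime - starttime;
        if (tracing) {
            // Illustrative message only; the real clienttrace format differs.
            TRACE.info("op: EXAMPLE, duration: " + duration);
        }
        return duration;
    }

    public static void main(String[] args) {
        timedOperation(() -> { /* placeholder for the I/O operation */ });
    }
}
```

Checking the log level once and reusing the result keeps the start and end reads consistent even if the level changes while the operation runs.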
        Lei (Eddy) Xu added a comment -

        The failure comes from:

        [junit] Running org.apache.hadoop.cli.TestCLI
        [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 34 sec
        [junit] Test org.apache.hadoop.cli.TestCLI FAILED

        This is not caused by my patch, since the svn HEAD source fails too.

        Brian Bockelman added a comment -

        Hi,

        This patch is useful for us to do timing statistics on our cluster - without it, we can't tell how many MB/s we're doing via pure HDFS.

        Compiles and works for me – tests pass. Don't know what the Hudson failure is from.

        Brian
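
As a rough illustration of the throughput calculation described here: once a byte count and a duration are available for a transfer, MB/s follows directly. The sketch below assumes the duration is expressed in nanoseconds and that both values have already been parsed from a trace line; the class and method names are hypothetical.

```java
public class Throughput {
    // Converts a byte count and a duration into MB/s (1 MB = 1024 * 1024 bytes).
    // Both inputs are assumed to come from an already-parsed trace entry.
    static double megabytesPerSecond(long bytes, long durationNanos) {
        if (durationNanos <= 0) {
            throw new IllegalArgumentException("duration must be positive");
        }
        final double seconds = durationNanos / 1e9;
        final double megabytes = bytes / (1024.0 * 1024.0);
        return megabytes / seconds;
    }

    public static void main(String[] args) {
        // A 64 MB block transferred in 2 seconds.
        System.out.println(megabytesPerSecond(64L * 1024 * 1024, 2_000_000_000L));
    }
}
```

Rejecting non-positive durations avoids a division by zero for entries logged with tracing disabled or with a missing duration.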

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12404577/duration.patch
        against trunk revision 761632.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/112/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/112/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/112/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-minerva.apache.org/112/console

        This message is automatically generated.

        Lei (Eddy) Xu added a comment -

        Put the operation duration into the client trace.


          People

          • Assignee:
            Lei (Eddy) Xu
            Reporter:
            Lei (Eddy) Xu
          • Votes:
            0
            Watchers:
            2
