Details

    • Type: Test
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.9.0, 3.0.0-alpha1, 2.7.5
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Referring to the DFSIO output below, I was surprised that the reported test throughput was only 17 MB/s, which doesn't make sense for a real cluster. Maybe that figure is meant for another purpose? For users, it may make more sense to report an aggregate throughput of about 1610 MB/s (1228800/763), calculated as Total MBytes processed / Test exec time.

      15/09/28 11:42:23 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
      15/09/28 11:42:23 INFO fs.TestDFSIO:            Date & time: Mon Sep 28 11:42:23 CST 2015
      15/09/28 11:42:23 INFO fs.TestDFSIO:        Number of files: 100
      15/09/28 11:42:23 INFO fs.TestDFSIO: Total MBytes processed: 1228800.0
      15/09/28 11:42:23 INFO fs.TestDFSIO:      Throughput mb/sec: 17.457387239456878
      15/09/28 11:42:23 INFO fs.TestDFSIO: Average IO rate mb/sec: 17.57563018798828
      15/09/28 11:42:23 INFO fs.TestDFSIO:  IO rate std deviation: 1.7076328985378455
      15/09/28 11:42:23 INFO fs.TestDFSIO:     Test exec time sec: 762.697
      15/09/28 11:42:23 INFO fs.TestDFSIO: 
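The figure proposed in the description can be reproduced from the log with a few lines of arithmetic. The sketch below is illustrative Python, not code from TestDFSIO; the variable names are made up for this example and only the numbers are taken from the log.

```python
# Values copied from the DFSIO log above.
total_mbytes = 1228800.0                   # "Total MBytes processed"
exec_time_sec = 762.697                    # "Test exec time sec"
reported_throughput = 17.457387239456878   # "Throughput mb/sec"

# Aggregate rate proposed in the description:
# total data divided by the job's wall-clock time.
aggregate_mb_per_sec = total_mbytes / exec_time_sec

# Same calculation with the exec time rounded to 763 s, as in the description.
aggregate_rounded = total_mbytes / 763

# How many times larger the proposed figure is than the reported one.
ratio = aggregate_mb_per_sec / reported_throughput

print(f"aggregate: {aggregate_mb_per_sec:.2f} MB/s")
print(f"aggregate (763 s): {aggregate_rounded:.2f} MB/s")
print(f"ratio to reported throughput: {ratio:.1f}x")
```

The ratio comes out close to the number of files (100). That would be consistent with the reported "Throughput" being a per-task rate, although the log alone does not confirm it.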
      

          Activity

          drankye Kai Zheng added a comment -

          Uploaded a patch:
          1. Added a new metric named 'Total Throughput';
          2. Refined the output of floating-point metrics using the format #.##
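To give a feel for what the #.## format does to the numbers in the log, here is a rough Python approximation: at most two fraction digits, with trailing zeros dropped. The helper name `format_metric` is hypothetical, and this is not a faithful reimplementation of the Java formatter the patch presumably uses; edge cases such as values below 1 or exact rounding ties are not modelled.

```python
def format_metric(value: float) -> str:
    """Approximate a '#.##' style format: up to two fraction digits,
    no trailing zeros and no trailing decimal point."""
    text = format(value, ".2f")          # e.g. "762.70"
    return text.rstrip("0").rstrip(".")  # e.g. "762.7"

# Metric values taken from the DFSIO log in the description.
for raw in (1228800.0, 17.457387239456878, 17.57563018798828,
            1.7076328985378455, 762.697):
    print(f"{raw!r:>22} -> {format_metric(raw)}")
```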

          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote  Subsystem        Runtime   Comment
          0     pre-patch        7m 22s    Pre-patch trunk compilation is healthy.
          +1    @author          0m 0s     The patch does not contain any @author tags.
          +1    tests included   0m 0s     The patch appears to include 1 new or modified test files.
          +1    javac            8m 52s    There were no new javac warning messages.
          +1    release audit    0m 23s    The applied patch does not increase the total number of release audit warnings.
          +1    checkstyle       0m 38s    There were no new checkstyle issues.
          +1    whitespace       0m 0s     The patch has no lines that end in whitespace.
          +1    install          1m 35s    mvn install still works.
          +1    eclipse:eclipse  0m 38s    The patch built with eclipse:eclipse.
          +1    findbugs         0m 54s    The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1    mapreduce tests  109m 31s  Tests failed in hadoop-mapreduce-client-jobclient.
                                 129m 57s



          Reason             Tests
          Failed unit tests  hadoop.mapreduce.lib.output.TestJobOutputCommitter
                             hadoop.mapred.TestNetworkedJob
                             hadoop.mapred.TestMRIntermediateDataEncryption
                             hadoop.mapred.TestClusterMRNotification
                             hadoop.mapred.TestLazyOutput
                             hadoop.mapred.TestMRTimelineEventHandling
                             hadoop.mapred.TestReduceFetch
                             hadoop.mapreduce.TestMapReduceLazyOutput



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12762664/HDFS-9153-v1.patch
          Optional Tests javac unit findbugs checkstyle
          git revision trunk / 66dad85
          hadoop-mapreduce-client-jobclient test log https://builds.apache.org/job/PreCommit-HDFS-Build/12705/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/12705/testReport/
          Java 1.7.0_55
          uname Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/12705/console

          This message was automatically generated.

          wheat9 Haohui Mai added a comment -

          +1. Committing it shortly.

          wheat9 Haohui Mai added a comment -

          I've committed the patch to trunk and branch-2. Thanks Kai Zheng for the contribution.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #8843 (See https://builds.apache.org/job/Hadoop-trunk-Commit/8843/)
          HDFS-9153. Pretty-format the output for DFSIO. Contributed by Kai Zheng. (wheat9: rev 000e12f6fa114dfa45377df23acf552e66410838)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java
          drankye Kai Zheng added a comment -

          Thanks Haohui Mai for committing this!

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2637 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2637/)
          HDFS-9153. Pretty-format the output for DFSIO. Contributed by Kai Zheng. (wheat9: rev 000e12f6fa114dfa45377df23acf552e66410838)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #697 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/697/)
          HDFS-9153. Pretty-format the output for DFSIO. Contributed by Kai Zheng. (wheat9: rev 000e12f6fa114dfa45377df23acf552e66410838)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #708 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/708/)
          HDFS-9153. Pretty-format the output for DFSIO. Contributed by Kai Zheng. (wheat9: rev 000e12f6fa114dfa45377df23acf552e66410838)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #1433 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/1433/)
          HDFS-9153. Pretty-format the output for DFSIO. Contributed by Kai Zheng. (wheat9: rev 000e12f6fa114dfa45377df23acf552e66410838)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #2567 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2567/)
          HDFS-9153. Pretty-format the output for DFSIO. Contributed by Kai Zheng. (wheat9: rev 000e12f6fa114dfa45377df23acf552e66410838)

          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #629 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/629/)
          HDFS-9153. Pretty-format the output for DFSIO. Contributed by Kai Zheng. (wheat9: rev 000e12f6fa114dfa45377df23acf552e66410838)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/fs/TestDFSIO.java
          shv Konstantin Shvachko added a comment - - edited

          Hey guys, I think the metric you introduced is absolutely deceiving and has nothing to do with the throughput the benchmark is intended to measure.
          "Test exec time" is the running time of the job, which includes the compute overhead: scheduling, cleanup, and retries if there were failed maps.
          What we want to benchmark, however, is the average throughput of the actual data transfers on HDFS. As you can see, the implementation measures the time of the transfers only.

          The formatting changes are fine, but I think "Total Throughput" should be removed.
          The bug reported in MAPREDUCE-6931 makes it invalid, but even if that is fixed, the metric is still deceiving.

          Also, DFSIO issues should be filed on the HDFS jira, where you can expect a more prompt response.
          Sorry, the last part was meant for the other jira. Please ignore it.
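The distinction drawn in this comment can be sketched with a toy calculation. The Python below is an illustration only, not the TestDFSIO implementation: the `TaskResult` type, the function names, the formulas, and the sample numbers are all assumptions chosen to show how a rate based on transfer time differs from one based on job running time.

```python
from dataclasses import dataclass

@dataclass
class TaskResult:
    mbytes: float      # data moved by one map task, in MB (hypothetical)
    io_seconds: float  # time that task spent in the transfer itself

def throughput(tasks):
    """Total data divided by the summed transfer time of all tasks."""
    return sum(t.mbytes for t in tasks) / sum(t.io_seconds for t in tasks)

def average_io_rate(tasks):
    """Mean of the per-task transfer rates."""
    return sum(t.mbytes / t.io_seconds for t in tasks) / len(tasks)

def job_time_throughput(tasks, job_seconds):
    """Total data divided by job running time, which also covers
    scheduling, cleanup, and retries."""
    return sum(t.mbytes for t in tasks) / job_seconds

# Two made-up tasks that ran concurrently.
tasks = [TaskResult(1000.0, 50.0), TaskResult(1000.0, 40.0)]

print(throughput(tasks))                  # transfer-time based
print(average_io_rate(tasks))             # transfer-time based
print(job_time_throughput(tasks, 60.0))   # job with little overhead
print(job_time_throughput(tasks, 120.0))  # same transfers, slow scheduling or retries
```

In this sketch the first two figures depend only on the transfers, while the job-time figure halves when the job takes twice as long for reasons unrelated to HDFS I/O.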

          drankye Kai Zheng added a comment -

          Hi Konstantin Shvachko,

          Thanks for pointing this out. Yes, "Total Throughput" isn't accurate, particularly for smaller IO operations. Previously we used it for large-file read/write operations, where the IO part is dominant. It looks like "Throughput" and "Average IO rate" are good enough and more accurate, so I agree we should remove the new metric.

          shv Konstantin Shvachko added a comment -

          Committed to branch-2.7 along with MAPREDUCE-6931.
          Updated the fix versions.


            People

            • Assignee:
              drankye Kai Zheng
              Reporter:
              drankye Kai Zheng
            • Votes:
              1
              Watchers:
              9
