Hadoop HDFS / HDFS-6287

Add vecsum test of libhdfs read access times

    Details

    • Type: Test
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.5.0
    • Component/s: libhdfs, test
    • Labels:
      None
    • Target Version/s:

      Description

      Add vecsum, a benchmark that tests libhdfs access times. This includes short-circuit, zero-copy, and standard libhdfs access modes. It also has a local filesystem mode for comparison.

      1. HDFS-6287.006.patch
        25 kB
        Colin Patrick McCabe
      2. HDFS-6287.005.patch
        25 kB
        Colin Patrick McCabe
      3. HDFS-6287.004.patch
        25 kB
        Colin Patrick McCabe
      4. HDFS-6287.003.patch
        24 kB
        Colin Patrick McCabe
      5. HDFS-6287.002.patch
        23 kB
        Colin Patrick McCabe
      6. HDFS-6282.001.patch
        23 kB
        Colin Patrick McCabe

          Activity

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12641990/HDFS-6282.001.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6737//console

          This message is automatically generated.

          Colin Patrick McCabe added a comment -

          Looks like the SSE intrinsics could not be found. I'm going to try again with #include <emmintrin.h>. If this doesn't work, I guess we'll have to make it auto-detect whether it can use SSE, or provide a compile option.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12642003/HDFS-6287.002.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6738//console

          This message is automatically generated.

          Colin Patrick McCabe added a comment -

          OK, time to implement auto-detection of SSE, I guess...

          Colin Patrick McCabe added a comment -

          Here's a version that tries to compile with SSE intrinsics, and falls back on a simple cross-platform loop if that fails.
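
To illustrate the fallback pattern described in this comment, here is a minimal sketch. It is not the code from the patch: the function name `sum_doubles`, the macro `SKETCH_HAVE_SSE2`, and the use of the compiler-defined `__SSE2__` macro for detection are assumptions made for illustration. The commit also touches CMakeLists.txt and config.h.cmake, which suggests the real detection may happen at configure time instead.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: detect SSE2 through the compiler's predefined macro. */
#if defined(__SSE2__)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

/* Sum `len` doubles from `buf`; `buf` does not need to be aligned. */
double sum_doubles(const double *buf, size_t len)
{
    double total = 0.0;
    size_t i = 0;

    assert(buf != NULL || len == 0);
#if SKETCH_HAVE_SSE2
    {
        /* Accumulate two doubles per iteration in a 128-bit register. */
        __m128d acc = _mm_setzero_pd();
        double lanes[2];

        for (; i + 2 <= len; i += 2) {
            acc = _mm_add_pd(acc, _mm_loadu_pd(buf + i));
        }
        _mm_storeu_pd(lanes, acc);
        total = lanes[0] + lanes[1];
    }
#endif
    /* Portable loop: handles everything without SSE2, or the tail with it. */
    for (; i < len; i++) {
        total += buf[i];
    }
    return total;
}
```

Both paths return the same sum for inputs that are exactly representable, so callers do not need to know which one was compiled in.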

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12642283/HDFS-6287.003.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 javac. The patch appears to cause the build to fail.

          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6756//console

          This message is automatically generated.

          Colin Patrick McCabe added a comment -

          Looks like on older glibc versions, such as the one our Jenkins machines are using, you need to link with librt to use clock_gettime. Added. I also fixed a warning message in test_libhdfs_threaded.
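
For context, a benchmark typically uses clock_gettime along the lines of the sketch below. The helper names `now_monotonic` and `timespec_diff_sec` and the choice of `CLOCK_MONOTONIC` are assumptions for illustration, not taken from the patch. On glibc releases before 2.17 this code needs `-lrt` at link time; newer releases provide clock_gettime in libc itself.

```c
/* Make clock_gettime visible under strict -std= modes. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <time.h>

/* Read the monotonic clock; returns 0 on success, -1 on failure. */
int now_monotonic(struct timespec *out)
{
    assert(out != NULL);
    return clock_gettime(CLOCK_MONOTONIC, out);
}

/* Elapsed time from `start` to `end`, in seconds. */
double timespec_diff_sec(const struct timespec *start,
                         const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) +
           (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}
```

A caller would take one reading before the read loop and one after, then divide the number of bytes read by the difference to get throughput.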

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12642293/HDFS-6287.004.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6757//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6757//console

          This message is automatically generated.

          Chris Nauroth added a comment -

          Hi, Colin. Thanks for posting this. Did you find that you needed to use SSE to get the addition fast enough so that the benchmark highlights read throughput instead of sum computation? IOW, could we potentially simplify this patch to not use SSE at all and still have a valid benchmark?

          I think it would be helpful to add a comment with a high-level summary of what vecsum does, maybe right before the main.

          I have one minor comment on the code itself so far. I think you can remove the hdfsFreeBuilder call. hdfsBuilderConnect always frees the builder, whether it succeeds or fails. The only time you would need to call hdfsFreeBuilder directly is if you allocated a builder but then never attempted to connect with it. I don't see any way for that to happen in the libhdfs_data_create code.

          Colin Patrick McCabe added a comment -

          "Hi, Colin. Thanks for posting this. Did you find that you needed to use SSE to get the addition fast enough so that the benchmark highlights read throughput instead of sum computation? IOW, could we potentially simplify this patch to not use SSE at all and still have a valid benchmark?"

          Without that optimization, the benchmark quickly becomes CPU-bound and you don't get true numbers for ZCR (zero-copy reads) and other fast read methods. I just benchmarked 1.5 GB/s for the un-optimized version versus 5.7 GB/s for the optimized one.

          "I think it would be helpful to add a comment with a high-level summary of what vecsum does, maybe right before the main."

          Added.

          "I have one minor comment on the code itself so far. I think you can remove the hdfsFreeBuilder call. hdfsBuilderConnect always frees the builder, whether it succeeds or fails. The only time you would need to call hdfsFreeBuilder directly is if you allocated a builder but then never attempted to connect with it. I don't see any way for that to happen in the libhdfs_data_create code."

          Yeah, that is dead code. Let me remove that.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12643399/HDFS-6287.005.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6849//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6849//console

          This message is automatically generated.

          Colin Patrick McCabe added a comment -
          • Fix an issue with the 'local' option where the file we created was not as long as it should have been.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12643874/HDFS-6287.006.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6874//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6874//console

          This message is automatically generated.

          Andrew Wang added a comment -

          +1 thanks Colin

          Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #5605 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5605/)
          HDFS-6287. Add vecsum test of libhdfs read access times (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594751)

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/config.h.cmake
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test/vecsum.c
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test_libhdfs_threaded.c
          Jason Lowe added a comment -

          This change breaks the build on RHEL4, which doesn't have RUSAGE_THREAD. Yes, RHEL4 is ancient, but we build against it in a 32-bit compatibility environment.
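
One possible way to tolerate a missing RUSAGE_THREAD is sketched below. This is an assumption for illustration only and not necessarily how HDFS-6421 resolved the problem: the macro `SKETCH_RUSAGE_WHO` and the function `cpu_seconds_used` are hypothetical names, and falling back to RUSAGE_SELF changes the measurement from per-thread to per-process CPU time.

```c
/* glibc only exposes RUSAGE_THREAD when _GNU_SOURCE is defined. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Prefer per-thread accounting; fall back to whole-process accounting. */
#ifdef RUSAGE_THREAD
#define SKETCH_RUSAGE_WHO RUSAGE_THREAD
#else
#define SKETCH_RUSAGE_WHO RUSAGE_SELF
#endif

/* Store user + system CPU seconds in *out; returns 0 on success, -1 on error. */
int cpu_seconds_used(double *out)
{
    struct rusage ru;

    assert(out != NULL);
    if (getrusage(SKETCH_RUSAGE_WHO, &ru) != 0) {
        return -1;
    }
    *out = (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6 +
           (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
    return 0;
}
```

The preprocessor check keeps the build working on platforms whose headers lack the constant, at the cost of less precise numbers in multi-threaded runs.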

          Colin Patrick McCabe added a comment -

          Create a JIRA? I will review any patches.

          Looks like we also need to remove #include <malloc.h>, since FreeBSD doesn't have it.

          Jason Lowe added a comment -

          I commented initially since I wasn't sure if you wanted it addressed here or separately. Filed HDFS-6421.

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #562 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/562/)
          HDFS-6287. Add vecsum test of libhdfs read access times (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594751)

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/config.h.cmake
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test/vecsum.c
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test_libhdfs_threaded.c
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1754 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1754/)
          HDFS-6287. Add vecsum test of libhdfs read access times (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594751)

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/config.h.cmake
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test/vecsum.c
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test_libhdfs_threaded.c
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1780 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1780/)
          HDFS-6287. Add vecsum test of libhdfs read access times (cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594751)

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/config.h.cmake
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test/vecsum.c
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/libhdfs/test_libhdfs_threaded.c

            People

            • Assignee: Colin Patrick McCabe
            • Reporter: Colin Patrick McCabe
            • Votes: 0
            • Watchers: 8
