Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-458

Create target for 10 minute patch test build for hdfs

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: build, test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      It would be good to identify a subset of hdfs tests that provide strong test code coverage within 10 minutes, as is the goal of MAPREDUCE-670 and HADOOP-5628.

      1. TenMinuteTestData.xlsx
        97 kB
        Jakob Homan
      2. jira.HDFS-458.branch-0.21.1xx.patch
        6 kB
        Erik Steffl
      3. jira.HDFS-458.branch-0.20.1xx.patch
        4 kB
        Erik Steffl
      4. HDFS-458.patch
        5 kB
        Jakob Homan
      5. HDFS-458.patch
        8 kB
        Jakob Homan
      6. build.xml
        92 kB
        Jakob Homan

        Issue Links

          Activity

          Hide
          Jakob Homan added a comment -

          I've completed initial work and am attaching a spreadsheet with the results (along with the hacked pre-split build file used). The HDFS unit tests were benchmarked repeatedly and Clover was used to generate coverage reports. From these data, a subset of tests TT2 was identified that provide maximum coverage within the 10 minute constraint.

          The end result is a proposed test set that runs on median in 9.22 minutes (compared to 40 minutes for all hdfs tests) and provides 88% of the original test coverage. I think this is a pretty good result. The median is used because a fair proportion of tests have shown outlier running times. Unfortunately, our initial code coverage number of 49% is lacking.

          A focus was put on maintaining as much as possible the code coverage of the major HDFS classes, including Namenode, FSNamesystem, DFSClient, etc. Many of the tests within hdfs effectively act as integration tests (particularly when a test invokes the MiniDFSCluster), making it relatively easy to identify tests that stress these classes and code paths.

          Another result in the spreadsheet is the identification of several tests with very large variations in run time that should be stabilized.

          The build script defines another build target, run-test-ten, which runs the tests that are defined as part of the ten-minute test. You can use this to play with other combos, if you like. However, it's a pre-split build script.

          Suggestions? Comments? Snark?

          Show
          Jakob Homan added a comment - I've completed initial work and am attaching a spreadsheet with the results (along with the hacked pre-split build file used). The HDFS unit tests were benchmarked repeatedly and Clover was used to generate coverage reports. From these data, a subset of tests TT2 was identified that provide maximum coverage within the 10 minute constraint. The end result is a proposed test set that runs on median in 9.22 minutes (compared to 40 minutes for all hdfs tests) and provides 88% of the original test coverage. I think this is a pretty good result. The median is used because a fair proportion of tests have shown outlier running times. Unfortunately, our initial code coverage number of 49% is lacking. A focus was put on maintaining as much as possible the code coverage of the major HDFS classes, including Namenode, FSNamesystem, DFSClient, etc. Many of the tests within hdfs effectively act as integration tests (particularly when a test invokes the MiniDFSCluster), making it relatively easy to identify tests that stress these classes and code paths. Another result in the spreadsheet is the identification of several tests with very large variations in run time that should be stabilized. The build script defines another build target, run-test-ten, which runs the tests that are defined as part of the ten-minute test. You can use this to play with other combos, if you like. However, it's a pre-split build script. Suggestions? Comments? Snark?
          Hide
          Nigel Daley added a comment -

          This looks great. Can you go ahead and list these tests into a Junit test suite class? I think that is the best way for now to capture them into a single suite that is run by the commit build.

          Can you also coordinate with MAPREDUCE-670 so that the build target is the same. Perhaps "ant full-test" for all of them and "ant unit-test" for these fast ones?

          Show
          Nigel Daley added a comment - This looks great. Can you go ahead and list these tests into a Junit test suite class? I think that is the best way for now to capture them into a single suite that is run by the commit build. Can you also coordinate with MAPREDUCE-670 so that the build target is the same. Perhaps "ant full-test" for all of them and "ant unit-test" for these fast ones?
          Hide
          Tom White added a comment -

          Or perhaps "ant patch-test" for the fast ones and keep "ant test" for the full suite? As Jakob notes, many of the tests are more than simple unit tests.

          Show
          Tom White added a comment - Or perhaps "ant patch-test" for the fast ones and keep "ant test" for the full suite? As Jakob notes, many of the tests are more than simple unit tests.
          Hide
          Jakob Homan added a comment -

          Can you go ahead and list these tests into a Junit test suite class? I think that is the best way for now to capture them into a single suite that is run by the commit build.

          Another option would be to specify the tests in a flat file (ten-minute-test or such):

          **/TestHDFSCLI.java
          **/TestUrlStreamHandler.java
          **/TestBlockReplacement.java
          **/TestDirectoryScanner.java
          **/TestDiskError.java
          etc...

          and then reference that file from within the ant target:

          <fileset dir="${test.src.dir}/hdfs" casesensitive="yes">
          
          <patternset>
            <includesfile name="ten-minute-test" />
          </patternset>
          </fileset>
          

          This would make it easier to change the tests included, without having to modify any code. Might be less opaque.

          Show
          Jakob Homan added a comment - Can you go ahead and list these tests into a Junit test suite class? I think that is the best way for now to capture them into a single suite that is run by the commit build. Another option would be to specify the tests in a flat file (ten-minute-test or such): **/TestHDFSCLI.java **/TestUrlStreamHandler.java **/TestBlockReplacement.java **/TestDirectoryScanner.java **/TestDiskError.java etc... and then reference that file from within the ant target: <fileset dir="${test.src.dir}/hdfs" casesensitive="yes"> <patternset> <includesfile name="ten-minute-test" /> </patternset> </fileset> This would make it easier to change the tests included, without having to modify any code. Might be less opaque.
          Hide
          Jakob Homan added a comment -

          Also, regarding the project split (which was done after my performance numbers). One test that had been included, TestFTPFileSystem, is no longer included in hdfs tests. This is fine, it was in the ten-minute test.

          Six new tests were added: TestFileSystem, TestNameNodeMetrics, TestSequenceFileMergeProgress, TestServiceLevelAuthorization, and TestSocketFactory. These aren't critical tests, but should be considered for inclusion later on. The test set definition will by necessity be fluid as new tests are written. Ideally, we should get all tests to have a shorter running time, which will let us squeeze more into the ten minutes.

          Show
          Jakob Homan added a comment - Also, regarding the project split (which was done after my performance numbers). One test that had been included, TestFTPFileSystem, is no longer included in hdfs tests. This is fine, it was in the ten-minute test. Six new tests were added: TestFileSystem, TestNameNodeMetrics, TestSequenceFileMergeProgress, TestServiceLevelAuthorization, and TestSocketFactory. These aren't critical tests, but should be considered for inclusion later on. The test set definition will by necessity be fluid as new tests are written. Ideally, we should get all tests to have a shorter running time, which will let us squeeze more into the ten minutes.
          Hide
          Jakob Homan added a comment -

          Done with patch. Created a separate file inside of src/test, commit-tests, which is the list of files to be included in the set. This file can be updated as the set's definition changes.

          Updated build.xml to include a new target, run-commit-test which executes the set. Tom had suggested patch-test, but I'm afraid this would be confusing with test-patch. I had hoped to avoid code duplication within the build file for the new target: it should have same configuration as the regular, full test suite. Only the fileset definition is different. However, after playing around for quite a while, I couldn't get a good conditional setup going due to the many limitations of Ant. Since the run-test-hdfs-with-mr target also duplicates a lot of code, I'm guessing this is just the way it'll have to be.

          One thing I don't like is Ant/Junit doesn't complain if it can't find a test specified in the test file (not even if debug is turned on). This means we'll have to keep an active eye on changes to the tests and if they are put in the ten-minute test suite. This should be ameliorated once we move to a testing framework that allows better categorization of tests.

          Show
          Jakob Homan added a comment - Done with patch. Created a separate file inside of src/test, commit-tests, which is the list of files to be included in the set. This file can be updated as the set's definition changes. Updated build.xml to include a new target, run-commit-test which executes the set. Tom had suggested patch-test, but I'm afraid this would be confusing with test-patch. I had hoped to avoid code duplication within the build file for the new target: it should have same configuration as the regular, full test suite. Only the fileset definition is different. However, after playing around for quite a while, I couldn't get a good conditional setup going due to the many limitations of Ant. Since the run-test-hdfs-with-mr target also duplicates a lot of code, I'm guessing this is just the way it'll have to be. One thing I don't like is Ant/Junit doesn't complain if it can't find a test specified in the test file (not even if debug is turned on). This means we'll have to keep an active eye on changes to the tests and if they are put in the ten-minute test suite. This should be ameliorated once we move to a testing framework that allows better categorization of tests.
          Hide
          Jakob Homan added a comment -

          submitting patch

          Show
          Jakob Homan added a comment - submitting patch
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12412913/HDFS-458.patch
          against trunk revision 792310.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 52 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 release audit. The applied patch generated 283 release audit warnings (more than the trunk's current 282 warnings).

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/artifact/trunk/current/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12412913/HDFS-458.patch against trunk revision 792310. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 52 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 283 release audit warnings (more than the trunk's current 282 warnings). -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/artifact/trunk/current/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/10/console This message is automatically generated.
          Hide
          Jakob Homan added a comment -

          The contrib test failure is unrelated.
          The core test is due the "create-c++-configure" target not existing, a problem with hudson.
          The release audit? I can't tell, there's a 404 on the release audit warning and when I run it locally I get gibberish...

          Show
          Jakob Homan added a comment - The contrib test failure is unrelated. The core test is due the "create-c++-configure" target not existing, a problem with hudson. The release audit? I can't tell, there's a 404 on the release audit warning and when I run it locally I get gibberish...
          Hide
          Tsz Wo Nicholas Sze added a comment -

          run-commit-test and run-test-hdfs have quite large amount of repeated codes. Could we create a common target for them?

          Show
          Tsz Wo Nicholas Sze added a comment - run-commit-test and run-test-hdfs have quite large amount of repeated codes. Could we create a common target for them?
          Hide
          Jakob Homan added a comment -

          Canceling patch to address Nicholas' comments.

          Show
          Jakob Homan added a comment - Canceling patch to address Nicholas' comments.
          Hide
          gary murry added a comment -

          +1 on the suite of tests to be included. Great work!!!

          Show
          gary murry added a comment - +1 on the suite of tests to be included. Great work!!!
          Hide
          Jakob Homan added a comment -

          Updated patch to share the common code for both the regular tests and the common tests, using a macro. Thanks to Nicholas for suggesting the workaround. I don't believe there's any lose of functionality.

          Show
          Jakob Homan added a comment - Updated patch to share the common code for both the regular tests and the common tests, using a macro. Thanks to Nicholas for suggesting the workaround. I don't believe there's any lose of functionality.
          Hide
          Jakob Homan added a comment -

          submitting patch

          Show
          Jakob Homan added a comment - submitting patch
          Hide
          Konstantin Boudnik added a comment -

          I am inclined to have 10 minutes tests to be collected in a JUnit's test suite instead of an external file. The reason is simple: with the file we'd have two points of maintenance: tests source code and one an auxiliary text files. Besides, a general approach is to use JUnit's suites for tagging purposes, thus having a suite here would be more uniform.

          Show
          Konstantin Boudnik added a comment - I am inclined to have 10 minutes tests to be collected in a JUnit's test suite instead of an external file. The reason is simple: with the file we'd have two points of maintenance: tests source code and one an auxiliary text files. Besides, a general approach is to use JUnit's suites for tagging purposes, thus having a suite here would be more uniform.
          Hide
          Jakob Homan added a comment -

          I considered that but thought that having one easy-to-edit source of tests would be a benefit. Add a test? Add its name. Delete a delete? Delete it. I think a plain text file is good for this type of info. Further, having them run in the same configuration as they would normally be run through ant via the update build file is nice as well.
          That being said, once the tagging/test categorization setup is complete, I definitely want to move to using that. I'd much rather be able to prefer to be able to annotate specific tests within each test suite to be included in the 10-minute set. Right now we have to have all the tests from each suite included, rather than on a test-by-test basis. As soon as that's up and running, we should convert over.

          Show
          Jakob Homan added a comment - I considered that but thought that having one easy-to-edit source of tests would be a benefit. Add a test? Add its name. Delete a delete? Delete it. I think a plain text file is good for this type of info. Further, having them run in the same configuration as they would normally be run through ant via the update build file is nice as well. That being said, once the tagging/test categorization setup is complete, I definitely want to move to using that. I'd much rather be able to prefer to be able to annotate specific tests within each test suite to be included in the 10-minute set. Right now we have to have all the tests from each suite included, rather than on a test-by-test basis. As soon as that's up and running, we should convert over.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12413892/HDFS-458.patch
          against trunk revision 794953.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 55 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 release audit. The applied patch generated 284 release audit warnings (more than the trunk's current 282 warnings).

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/current/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12413892/HDFS-458.patch against trunk revision 794953. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 55 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 284 release audit warnings (more than the trunk's current 282 warnings). -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/current/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/console This message is automatically generated.
          Hide
          Giridharan Kesavan added a comment -

          Please use this link for Releaseaudit warnigns:
          http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
          I ve this fixed for the latest patch-test builds.

          Show
          Giridharan Kesavan added a comment - Please use this link for Releaseaudit warnigns: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-vesta.apache.org/25/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt I ve this fixed for the latest patch-test builds.
          Hide
          Jakob Homan added a comment -

          OK, Hudson is completely wonky on this one.

          -1 release audit. The applied patch generated 284 release audit warnings (more than the trunk's current 282 warnings).

          The two new warnings are due to the files that list the tests don't have apache licenses. There's no support for this in ant. They can't have the licenses (it could conceivably work, but may screw things up later). This is OK.

          -1 core tests. The patch failed core unit tests.

          Hudson is basing this on the fact that the number of tests went down from the previous patch it tested. However, that patch (HDFS-492), which introduced a new test, has not been committed. Therefore this is not a valid way of determining if there was a problem with the tests. I've spoken with Giri about this. Again, this is ok.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          If you look at the raw console output it actually says this patch improved the number of javac warnings, which isn't possible. Again, this is ok.
          The patch is good to go.

          Show
          Jakob Homan added a comment - OK, Hudson is completely wonky on this one. -1 release audit. The applied patch generated 284 release audit warnings (more than the trunk's current 282 warnings). The two new warnings are due to the files that list the tests don't have apache licenses. There's no support for this in ant. They can't have the licenses (it could conceivably work, but may screw things up later). This is OK. -1 core tests. The patch failed core unit tests. Hudson is basing this on the fact that the number of tests went down from the previous patch it tested. However, that patch ( HDFS-492 ), which introduced a new test, has not been committed. Therefore this is not a valid way of determining if there was a problem with the tests. I've spoken with Giri about this. Again, this is ok. +1 javac. The applied patch does not increase the total number of javac compiler warnings. If you look at the raw console output it actually says this patch improved the number of javac warnings, which isn't possible. Again, this is ok. The patch is good to go.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good. I agree that the patch is ready to go.

          The javac problem was reported in HADOOP-6124.

          Show
          Tsz Wo Nicholas Sze added a comment - +1 patch looks good. I agree that the patch is ready to go. The javac problem was reported in HADOOP-6124 .
          Hide
          gary murry added a comment -

          +1 From a QA stand point, everything has been account for. We just want to make sure there are bugs in for all the issues that were raised. (According to Jakob there are.)

          Show
          gary murry added a comment - +1 From a QA stand point, everything has been account for. We just want to make sure there are bugs in for all the issues that were raised. (According to Jakob there are.)
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Note that the patch has removed the aop dir from run-test-hdfs target. The aop related targets will be done in HDFS-493.

          -        <fileset dir="${test.src.dir}/aop"
          -          includes="**/${test.include}.java"
          -          excludes="**/${test.exclude}.java" />
          

          Tried "ant run-test-hdfs" and "ant run-commit-test" manually. Both worked fine.

          I have committed this. Thanks, Jakob!

          Show
          Tsz Wo Nicholas Sze added a comment - Note that the patch has removed the aop dir from run-test-hdfs target. The aop related targets will be done in HDFS-493 . - <fileset dir= "${test.src.dir}/aop" - includes= "**/${test.include}.java" - excludes= "**/${test.exclude}.java" /> Tried "ant run-test-hdfs" and "ant run-commit-test" manually. Both worked fine. I have committed this. Thanks, Jakob!
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Question: Given that a patch has already passed review and test-patch, is it ready to be committed once it has passed "ant run-commit-test" (instead of "ant test")?

          Show
          Tsz Wo Nicholas Sze added a comment - Question: Given that a patch has already passed review and test-patch, is it ready to be committed once it has passed "ant run-commit-test" (instead of "ant test")?
          Hide
          Konstantin Boudnik added a comment -

          +1 on using 'run-commit-test' for patch verifications.

          Show
          Konstantin Boudnik added a comment - +1 on using 'run-commit-test' for patch verifications.

            People

            • Assignee:
              Jakob Homan
              Reporter:
              Jakob Homan
            • Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development