Hadoop HDFS / HDFS-1877

Create a functional test for file read/write

    Details

    • Type: Test
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      It would be great to have a tool, running on a real grid, to perform functional tests (and, to a certain extent, stress tests) on file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone or as a building block for other regression or stress test suites (written in shell, Perl, Python, etc.).
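For illustration, the write, flush, and read-back verification described above might be sketched as follows. This is a minimal sketch and not the attached TestWriteRead code: local files via java.nio stand in for HDFS streams, OutputStream.flush() stands in for hflush, and the class and method names (WriteReadSketch, appendChunk, verify, expectedByte) are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: local files stand in for HDFS streams (assumption). */
public class WriteReadSketch {

    /** Deterministic byte expected at a given file offset. */
    static byte expectedByte(long offset) {
        return (byte) (offset % 127);
    }

    /** Appends chunkSize pattern bytes, continuing from the current file length. */
    static void appendChunk(Path file, int chunkSize) throws IOException {
        long start = Files.exists(file) ? Files.size(file) : 0L;
        byte[] chunk = new byte[chunkSize];
        for (int i = 0; i < chunkSize; i++) {
            chunk[i] = expectedByte(start + i);
        }
        try (OutputStream out = Files.newOutputStream(file,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            out.write(chunk);
            out.flush(); // stand-in for hflush() on an HDFS output stream
        }
    }

    /** Reads the file sequentially and returns the number of bytes verified. */
    static long verify(Path file) throws IOException {
        long offset = 0;
        byte[] buf = new byte[4096];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) > 0) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] != expectedByte(offset + i)) {
                        throw new IOException("Mismatch at offset " + (offset + i));
                    }
                }
                offset += n;
            }
        }
        return offset;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("write-read-sketch", ".dat");
        try {
            // Append three chunks; after each append, re-read and verify everything.
            for (int round = 1; round <= 3; round++) {
                appendChunk(file, 1000);
                long verified = verify(file);
                if (verified != round * 1000L) {
                    throw new IllegalStateException(
                        "Expected " + (round * 1000L) + " bytes, verified " + verified);
                }
            }
            System.out.println("verified " + Files.size(file) + " bytes");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Deriving each byte from its file offset lets a reader check any prefix of the file without keeping a copy of what was written, which is convenient when verification runs after every append.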

      1. TestWriteRead.patch
        14 kB
        CW Chung
      2. TestWriteRead.patch
        14 kB
        CW Chung
      3. TestWriteRead.patch
        14 kB
        CW Chung
      4. TestWriteRead.java
        12 kB
        CW Chung

        Activity

        Todd Lipcon added a comment -

        What you're describing sounds like TestDFSIO?

        CW Chung added a comment -

        Good question. The differences are:
        1. This tool would be run as a vanilla client, not as map/reduce tasks, so you can test HDFS file operations without depending on another subsystem (map/reduce).
        2. The focus of the tool is not performance but verification of functional correctness.

        Konstantin Boudnik added a comment -

        This sounds like the non-instrumented side of Herriot that was discussed a while ago on common-dev@ or general@. To me it sounds like an actual system test for HDFS rather than a special tool. Am I correct?

        CW Chung added a comment -

        Yes, this is a system test for HDFS. It is meant to be a small, quick-to-implement test that can be run on a test or production cluster, without any special library or change in cluster configuration. Although it can be used as a building block for system regression tests, it is nowhere near the scope or complexity of Herriot or S-LIVE/DFSIO. So let's call it a test rather than a tool.

        Konstantin Boudnik added a comment -

        Thanks for the explanation, CW. A couple of things (questions):

        • will this one be completely standalone (e.g. a separate jar file or added to hadoop-test.jar)?
        • if not, then you'll perhaps need the source code tree, along with the build files and all, to be present in order to run it?
        CW Chung added a comment -

        I would like to add it to hadoop-test.jar so that there is minimal hassle in using it. The source would likely be in hadoop-hdfs under src/test/hdfs/org/apache/hadoop/hdfs.

        Konstantin Boudnik added a comment -

        Unless you already have something in mind, I see a couple of interesting approaches to solve this.

        • take some of the existing functional tests and convert them to be usable on a real cluster (something like TestReadWhileWriting, perhaps with append-specific calls added)
        • HDFS-1762 would be another one worth thinking about. TestHDFSCLI apparently won't use programmatic APIs but will exercise HDFS through the hadoop starter.

        The advantage of the latter is that you can add new test cases by simply editing an XML config file and letting the framework do the rest for you.

        CW Chung added a comment -

        Yes, the first approach is closer to what I have in mind. I should be able to upload a patch for comment pretty soon.

        Thanks for pointing out HDFS-1762. Making TestHDFSCLI run in a real cluster looks like an interesting exercise.

        CW Chung added a comment -

        This test can be run both as a JUnit test and as a standalone test on a real cluster. This test will generate an exception (BlockMissingException).

        Tsz Wo Nicholas Sze added a comment -

        Hi CW, please use "svn diff" to generate a patch.

        CW Chung added a comment -

        Thanks for the suggestion. Here is another patch with the following improvements:
        1. Use svn diff format
        2. System.out.println changed to LOG
        3. Use sequential read rather than positional read. The JUnit test will therefore pass.
        4. Removed unused imports and declarations
        5. Test routine names begin with Test
        6. Added parsing of command-line options
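The difference between the two read styles in item 3 can be sketched as follows. This is only an analogy on a local file, using java.nio FileChannel as a stand-in for an HDFS input stream; the class and method names (ReadModesSketch, readSequential, readAt) are hypothetical, and the sketch does not reproduce the BlockMissingException mentioned earlier.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical sketch: FileChannel stands in for an HDFS input stream (assumption). */
public class ReadModesSketch {

    /** Sequential read: continues from the channel's current position and advances it. */
    static byte[] readSequential(FileChannel ch, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining() && ch.read(buf) >= 0) {
            // keep reading until the buffer is full or end of file is reached
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    /** Positional read: reads at an explicit offset; the channel's position is unchanged. */
    static byte[] readAt(FileChannel ch, long position, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = position;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, pos);
            if (n < 0) {
                break; // end of file
            }
            pos += n;
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-modes", ".dat");
        try {
            Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                String first = new String(readSequential(ch, 4), StandardCharsets.US_ASCII);
                String middle = new String(readAt(ch, 6, 3), StandardCharsets.US_ASCII);
                String next = new String(readSequential(ch, 4), StandardCharsets.US_ASCII);
                // The positional read in the middle does not disturb the sequential cursor.
                System.out.println(first + " " + middle + " " + next); // prints "0123 678 4567"
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```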

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12479712/TestWriteRead.patch
        against trunk revision 1124459.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.cli.TestHDFSCLI
        org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
        org.apache.hadoop.hdfs.server.namenode.TestNodeCount
        org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
        org.apache.hadoop.hdfs.TestFileConcurrentReader
        org.apache.hadoop.tools.TestJMXGet

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/579//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/579//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/579//console

        This message is automatically generated.

        Tsz Wo Nicholas Sze added a comment -
        • The variables inJunitMode, BLOCK_SIZE, and dfs are not actually used. Please remove them.
        • How about making the default filenameOption equal to ROOT_DIR?
        • You may simply have static private Log LOG = LogFactory.getLog(TestWriteRead.class); instead of:
          +  static private Log LOG;
          +
          +  @Before
          +  public void initJunitModeTest() throws Exception {
          +    LOG = LogFactory.getLog(TestWriteRead.class);
          
        • Please remove the following. The default is already INFO.
          +    ((Log4JLogger) FSNamesystem.LOG).getLogger().setLevel(Level.INFO);
          +    ((Log4JLogger) DFSClient.LOG).getLogger().setLevel(Level.INFO);
          
        • Most public methods should be package private.
        • Please add comments to tell how to use the command options and the default values.
        CW Chung added a comment -

        Implemented the suggestions by Nicholas. Thanks!

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12479831/TestWriteRead.patch
        against trunk revision 1125057.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
        org.apache.hadoop.hdfs.TestFileConcurrentReader
        org.apache.hadoop.hdfs.TestHDFSTrash

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//console

        This message is automatically generated.

        Tsz Wo Nicholas Sze added a comment -

        CW, please grant license to ASF for your latest patch.

        CW Chung added a comment -

        Granted license to Apache. Otherwise, this version is the same as the one submitted 2 hours ago.

        Tsz Wo Nicholas Sze added a comment -

        +1 patch looks good

        Tsz Wo Nicholas Sze added a comment -

        I have committed this. Thanks, CW!

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #677 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/677/)

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #673 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/673/)


          People

          • Assignee: CW Chung
          • Reporter: CW Chung
          • Votes: 0
          • Watchers: 7
