Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-10415

TestDistributedFileSystem#MyDistributedFileSystem attempts to set up statistics before initialize() is called

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: test
    • Labels:
      None
    • Environment:

      jenkins

    • Target Version/s:

      Description

      Tests run: 24, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 51.096 sec <<< FAILURE! - in org.apache.hadoop.hdfs.TestDistributedFileSystem
      testDFSCloseOrdering(org.apache.hadoop.hdfs.TestDistributedFileSystem)  Time elapsed: 0.045 sec  <<< ERROR!
      java.lang.NullPointerException: null
      	at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:790)
      	at org.apache.hadoop.fs.FileSystem.processDeleteOnExit(FileSystem.java:1417)
      	at org.apache.hadoop.fs.FileSystem.close(FileSystem.java:2084)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1187)
      	at org.apache.hadoop.hdfs.TestDistributedFileSystem.testDFSCloseOrdering(TestDistributedFileSystem.java:217)
      

      This is with Java 8 on Mac. It passes fine on trunk. I haven't tried other combinations.

      1. HDFS-10415-branch-2.000.patch
        2 kB
        Mingliang Liu
      2. HDFS-10415-branch-2.001.patch
        1 kB
        Mingliang Liu
      3. HDFS-10415.000.patch
        0.8 kB
        Mingliang Liu

        Issue Links

          Activity

          Hide
          liuml07 Mingliang Liu added a comment -

          Thank you Colin P. McCabe for your review and commit!

          Show
          liuml07 Mingliang Liu added a comment - Thank you Colin P. McCabe for your review and commit!
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-trunk-Commit #9890 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9890/)
          HDFS-10415. TestDistributedFileSystem#MyDistributedFileSystem attempts (cmccabe: rev 29d6cadc52e411990c8237fd2fa71257cea60d9a)

          • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-trunk-Commit #9890 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9890/ ) HDFS-10415 . TestDistributedFileSystem#MyDistributedFileSystem attempts (cmccabe: rev 29d6cadc52e411990c8237fd2fa71257cea60d9a) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java
          Hide
          cmccabe Colin P. McCabe added a comment -

          Committed to 2.8.

          Show
          cmccabe Colin P. McCabe added a comment - Committed to 2.8.
          Hide
          cmccabe Colin P. McCabe added a comment - - edited

          The subclass can change the configuration that gets passed to the superclass.

          class SuperClass {
            SuperClass(Configuration conf) {
              ... initialize superclass part of the object ...
            }
          }
          
          class SubClass extends SuperClass {
            SubClass(Configuration conf) {
              super(changeConf(conf));
              ... initialize my part of the object ...
            }
          
            private static Configuration changeConf(Configuration conf) {
              Configuration nconf = new Configuration(conf);
              nconf.set("foo", "bar");
              return nconf;
            }
          }
          

          Having a separate init() method is a well-known antipattern. Initialization belongs in the constructor. The only time a separate init method is really necessary is if you're using a dialect of C++ that doesn't support exceptions.

          Show
          cmccabe Colin P. McCabe added a comment - - edited The subclass can change the configuration that gets passed to the superclass. class SuperClass { SuperClass(Configuration conf) { ... initialize superclass part of the object ... } } class SubClass extends SuperClass { SubClass(Configuration conf) { super (changeConf(conf)); ... initialize my part of the object ... } private static Configuration changeConf(Configuration conf) { Configuration nconf = new Configuration(conf); nconf.set( "foo" , "bar" ); return nconf; } } Having a separate init() method is a well-known antipattern. Initialization belongs in the constructor. The only time a separate init method is really necessary is if you're using a dialect of C++ that doesn't support exceptions.
          Hide
          cmccabe Colin P. McCabe added a comment -

          It sounds like there are no strong objections to HDFS-10415.000.patch and HDFS-10415-branch-2.001.patch Let's fix this unit test!

          We can improve this in a follow-on JIRA (personally, I like the idea of adding the initialization to the init method). But it's not worth blocking the unit test fix.

          +1.

          Show
          cmccabe Colin P. McCabe added a comment - It sounds like there are no strong objections to HDFS-10415 .000.patch and HDFS-10415 -branch-2.001.patch Let's fix this unit test! We can improve this in a follow-on JIRA (personally, I like the idea of adding the initialization to the init method). But it's not worth blocking the unit test fix. +1.
          Hide
          liuml07 Mingliang Liu added a comment -

          I'm okay with reconciling with trunk if it fixes the problem.

          It does.

          Can anyone commit this patch to trunk and branch-2 if looks good? Thanks.

          Show
          liuml07 Mingliang Liu added a comment - I'm okay with reconciling with trunk if it fixes the problem. It does. Can anyone commit this patch to trunk and branch-2 if looks good? Thanks.
          Hide
          andrew.wang Andrew Wang added a comment -

          I think that's just an omission, though it's been 3 years. I'm okay with reconciling with trunk if it fixes the problem.

          Show
          andrew.wang Andrew Wang added a comment - I think that's just an omission, though it's been 3 years. I'm okay with reconciling with trunk if it fixes the problem.
          Hide
          liuml07 Mingliang Liu added a comment -

          If no objections about the 1st approach, does the v1 patch look good? Thanks!

          Show
          liuml07 Mingliang Liu added a comment - If no objections about the 1st approach, does the v1 patch look good? Thanks!
          Hide
          stevel@apache.org Steve Loughran added a comment -

          actually, there's a good argument for having the configuration binding being done in an init() method; the Yarn Service model does the same thing: it lets you subclass properly.

          Any FS can subclass initialize(), patch up the config before calling the superclass, do something more after, all knowing that the rest of the overridden methods will get called. You cannot do things like that in constructors

          Show
          stevel@apache.org Steve Loughran added a comment - actually, there's a good argument for having the configuration binding being done in an init() method; the Yarn Service model does the same thing: it lets you subclass properly . Any FS can subclass initialize(), patch up the config before calling the superclass, do something more after, all knowing that the rest of the overridden methods will get called. You cannot do things like that in constructors
          Hide
          sjlee0 Sangjin Lee added a comment -

          #1 sounds like the right thing to do. Andrew Wang, was there a reason that MyDistributedFileSystem#delete() was left out in branch-2?

          Show
          sjlee0 Sangjin Lee added a comment - #1 sounds like the right thing to do. Andrew Wang , was there a reason that MyDistributedFileSystem#delete() was left out in branch-2?
          Hide
          liuml07 Mingliang Liu added a comment -

          When HADOOP-9418 was committed to trunk and branch-2 the overridden MyDistributedFileSystem#delete() was not in branch-2, which is the only difference between the two commits. I don't have too much context about this, and boldly assume this as an omission.

          As this case is for test only and hardly shows up in real systems, I don't have strong preference (which is listed though) regarding the possible fixes we discussed here.

          1. Make MyDistributedFileSystem override delete() in branch-2 as in trunk (see patch v1)
          2. Spy instead of mocking (see patch v0) for both trunk and branch-2
          3. Construct storageStatistics as we construct statistics
          Show
          liuml07 Mingliang Liu added a comment - When HADOOP-9418 was committed to trunk and branch-2 the overridden MyDistributedFileSystem#delete() was not in branch-2 , which is the only difference between the two commits. I don't have too much context about this, and boldly assume this as an omission. As this case is for test only and hardly shows up in real systems, I don't have strong preference (which is listed though) regarding the possible fixes we discussed here. Make MyDistributedFileSystem override delete() in branch-2 as in trunk (see patch v1) Spy instead of mocking (see patch v0) for both trunk and branch-2 Construct storageStatistics as we construct statistics
          Hide
          cmccabe Colin P. McCabe added a comment -

          As Steve Loughran's concern, if the stats has nothing to do with this unit test, we can consider avoiding it. I'm more favor of this approach.

          Sure. Thanks for the explanation.

          there's another option, you know. Do the stats init in the constructor rather than initialize. There is no information used in setting up DFSClient.storageStatistics, its only ever written to once. Move it to the constructor and make final and maybe this problem will go away (maybe, mocks are a PITA)

          It seems like this would prevent us from using the Configuration object in the future when creating stats, right? I think we should keep this flexibility.

          This whole problem arises because the FileSystem constructor doesn't require a Configuration and it should, which leads to the "construct then initialize" idiom. If it just took a Configuration in the first place we could initialize everything in the constructor. grumble grumble

          Show
          cmccabe Colin P. McCabe added a comment - As Steve Loughran's concern, if the stats has nothing to do with this unit test, we can consider avoiding it. I'm more favor of this approach. Sure. Thanks for the explanation. there's another option, you know. Do the stats init in the constructor rather than initialize. There is no information used in setting up DFSClient.storageStatistics, its only ever written to once. Move it to the constructor and make final and maybe this problem will go away (maybe, mocks are a PITA) It seems like this would prevent us from using the Configuration object in the future when creating stats, right? I think we should keep this flexibility. This whole problem arises because the FileSystem constructor doesn't require a Configuration and it should, which leads to the "construct then initialize" idiom. If it just took a Configuration in the first place we could initialize everything in the constructor. grumble grumble
          Hide
          stevel@apache.org Steve Loughran added a comment -

          there's another option, you know. Do the stats init in the constructor rather than initialize. There is no information used in setting up DFSClient.storageStatistics, its only ever written to once. Move it to the constructor and make final and maybe this problem will go away (maybe, mocks are a PITA)

          Show
          stevel@apache.org Steve Loughran added a comment - there's another option, you know. Do the stats init in the constructor rather than initialize. There is no information used in setting up DFSClient.storageStatistics , its only ever written to once. Move it to the constructor and make final and maybe this problem will go away (maybe, mocks are a PITA)
          Hide
          liuml07 Mingliang Liu added a comment -

          Hi Colin P. McCabe,

          Thanks for the comment. Your proposal is a bit similar to my solution 2 (see the comment above).

          The problem is that, though it's indeed more nature fix, the initialize() method is never called after MyDistributedFileSystem is constructed. This is because, the object is not created by the static factory method FileSystem#get(). As a result, justing implementing the initialize() is not able to work. We have to call it.
          As to implementation,

          1. the dfs object is mocked in the MyDistributedFileSystem constructor. The DistributedFileSystem#initialize() method will reset this value, which is expected generally. We need to mock the dfs in the initialize() after calling super.initialize()
          2. super.initialize() will take care of the statistics and storageStatistics so we don't need to create it explicitly after that I believe?

          As Steve Loughran's concern, if the stats has nothing to do with this unit test, we can consider avoiding it. I'm more favor of this approach.

          My concern is that, why the trunk and branch-2 have the code difference as following? If the javadoc is true, it should stand for both of the branches. I must have missed something?

          +    // Symlink resolution doesn't work with a mock, since it doesn't
          +    // have a valid Configuration to resolve paths to the right FileSystem.
          +    // Just call the DFSClient directly to register the delete
          +    @Override
          +    public boolean delete(Path f, final boolean recursive) throws IOException {
          +      return dfs.delete(f.toUri().getPath(), recursive);
          +    }
          
          Show
          liuml07 Mingliang Liu added a comment - Hi Colin P. McCabe , Thanks for the comment. Your proposal is a bit similar to my solution 2 (see the comment above). The problem is that, though it's indeed more nature fix, the initialize() method is never called after MyDistributedFileSystem is constructed. This is because, the object is not created by the static factory method FileSystem#get() . As a result, justing implementing the initialize() is not able to work. We have to call it. As to implementation, the dfs object is mocked in the MyDistributedFileSystem constructor. The DistributedFileSystem#initialize() method will reset this value, which is expected generally. We need to mock the dfs in the initialize() after calling super.initialize() super.initialize() will take care of the statistics and storageStatistics so we don't need to create it explicitly after that I believe? As Steve Loughran 's concern, if the stats has nothing to do with this unit test, we can consider avoiding it. I'm more favor of this approach. My concern is that, why the trunk and branch-2 have the code difference as following? If the javadoc is true, it should stand for both of the branches. I must have missed something? + // Symlink resolution doesn't work with a mock, since it doesn't + // have a valid Configuration to resolve paths to the right FileSystem. + // Just call the DFSClient directly to register the delete + @Override + public boolean delete(Path f, final boolean recursive) throws IOException { + return dfs.delete(f.toUri().getPath(), recursive); + }
          Hide
          cmccabe Colin P. McCabe added a comment -

          Thanks for looking at this.

          So basically the problem is that we're attempting to do something in the constructor of our DistributedFileSystem subclass that requires that the FS already be initialized. Why not just override the initialize method with something like:

          public void initialize() {
            super.initialize();
            statistics = new FileSystem.Statistics("myhdfs"); // can't mock finals
          }
          

          That seems like the most natural fix since it's not doing "weird stuff" that we don't do outside unit tests.

          I don't feel strongly about this, though, any of the solutions proposed here seems like it would work.

          Show
          cmccabe Colin P. McCabe added a comment - Thanks for looking at this. So basically the problem is that we're attempting to do something in the constructor of our DistributedFileSystem subclass that requires that the FS already be initialized. Why not just override the initialize method with something like: public void initialize() { super .initialize(); statistics = new FileSystem.Statistics( "myhdfs" ); // can't mock finals } That seems like the most natural fix since it's not doing "weird stuff" that we don't do outside unit tests. I don't feel strongly about this, though, any of the solutions proposed here seems like it would work.
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 15s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 7m 22s trunk passed
          +1 compile 0m 43s trunk passed
          +1 checkstyle 0m 26s trunk passed
          +1 mvnsite 0m 55s trunk passed
          +1 mvneclipse 0m 14s trunk passed
          +1 findbugs 1m 43s trunk passed
          +1 javadoc 1m 8s trunk passed
          +1 mvninstall 0m 48s the patch passed
          +1 compile 0m 44s the patch passed
          +1 javac 0m 44s the patch passed
          +1 checkstyle 0m 25s the patch passed
          +1 mvnsite 0m 49s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 1m 56s the patch passed
          +1 javadoc 1m 2s the patch passed
          -1 unit 72m 34s hadoop-hdfs in the patch failed.
          +1 asflicense 0m 19s Patch does not generate ASF License warnings.
          92m 51s



          Reason Tests
          Failed junit tests hadoop.hdfs.shortcircuit.TestShortCircuitCache
            hadoop.hdfs.web.TestWebHDFS
            hadoop.hdfs.TestAsyncDFSRename



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:2c91fd8
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804557/HDFS-10415.000.patch
          JIRA Issue HDFS-10415
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux b81258dbd051 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 8a9ecb7
          Default Java 1.8.0_91
          findbugs v3.0.0
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15474/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15474/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15474/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15474/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 15s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 7m 22s trunk passed +1 compile 0m 43s trunk passed +1 checkstyle 0m 26s trunk passed +1 mvnsite 0m 55s trunk passed +1 mvneclipse 0m 14s trunk passed +1 findbugs 1m 43s trunk passed +1 javadoc 1m 8s trunk passed +1 mvninstall 0m 48s the patch passed +1 compile 0m 44s the patch passed +1 javac 0m 44s the patch passed +1 checkstyle 0m 25s the patch passed +1 mvnsite 0m 49s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 0s Patch has no whitespace issues. +1 findbugs 1m 56s the patch passed +1 javadoc 1m 2s the patch passed -1 unit 72m 34s hadoop-hdfs in the patch failed. +1 asflicense 0m 19s Patch does not generate ASF License warnings. 92m 51s Reason Tests Failed junit tests hadoop.hdfs.shortcircuit.TestShortCircuitCache   hadoop.hdfs.web.TestWebHDFS   hadoop.hdfs.TestAsyncDFSRename Subsystem Report/Notes Docker Image:yetus/hadoop:2c91fd8 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804557/HDFS-10415.000.patch JIRA Issue HDFS-10415 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux b81258dbd051 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 8a9ecb7 Default Java 1.8.0_91 findbugs v3.0.0 unit https://builds.apache.org/job/PreCommit-HDFS-Build/15474/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15474/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15474/testReport/ modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15474/console Powered by Apache Yetus 0.2.0 http://yetus.apache.org This message was automatically generated.
          Hide
          hadoopqa Hadoop QA added a comment -
          +1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 24s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 8m 27s trunk passed
          +1 compile 0m 56s trunk passed
          +1 checkstyle 0m 26s trunk passed
          +1 mvnsite 0m 51s trunk passed
          +1 mvneclipse 0m 12s trunk passed
          +1 findbugs 1m 42s trunk passed
          +1 javadoc 1m 10s trunk passed
          +1 mvninstall 0m 49s the patch passed
          +1 compile 0m 44s the patch passed
          +1 javac 0m 44s the patch passed
          +1 checkstyle 0m 25s the patch passed
          +1 mvnsite 0m 52s the patch passed
          +1 mvneclipse 0m 10s the patch passed
          +1 whitespace 0m 1s Patch has no whitespace issues.
          +1 findbugs 1m 47s the patch passed
          +1 javadoc 1m 8s the patch passed
          +1 unit 72m 48s hadoop-hdfs in the patch passed.
          +1 asflicense 0m 21s Patch does not generate ASF License warnings.
          94m 37s



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:2c91fd8
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804557/HDFS-10415.000.patch
          JIRA Issue HDFS-10415
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux be1621e007c2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / 8a9ecb7
          Default Java 1.8.0_91
          findbugs v3.0.0
          Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15468/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15468/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 24s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 8m 27s trunk passed +1 compile 0m 56s trunk passed +1 checkstyle 0m 26s trunk passed +1 mvnsite 0m 51s trunk passed +1 mvneclipse 0m 12s trunk passed +1 findbugs 1m 42s trunk passed +1 javadoc 1m 10s trunk passed +1 mvninstall 0m 49s the patch passed +1 compile 0m 44s the patch passed +1 javac 0m 44s the patch passed +1 checkstyle 0m 25s the patch passed +1 mvnsite 0m 52s the patch passed +1 mvneclipse 0m 10s the patch passed +1 whitespace 0m 1s Patch has no whitespace issues. +1 findbugs 1m 47s the patch passed +1 javadoc 1m 8s the patch passed +1 unit 72m 48s hadoop-hdfs in the patch passed. +1 asflicense 0m 21s Patch does not generate ASF License warnings. 94m 37s Subsystem Report/Notes Docker Image:yetus/hadoop:2c91fd8 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804557/HDFS-10415.000.patch JIRA Issue HDFS-10415 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux be1621e007c2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 8a9ecb7 Default Java 1.8.0_91 findbugs v3.0.0 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15468/testReport/ modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15468/console Powered by Apache Yetus 0.2.0 http://yetus.apache.org This message was automatically generated.
          Hide
          liuml07 Mingliang Liu added a comment -

          Sangjin Lee, interesting question. I see code difference in MyDistributedFileSystem between trunk and branch-2:

          +    // Symlink resolution doesn't work with a mock, since it doesn't
          +    // have a valid Configuration to resolve paths to the right FileSystem.
          +    // Just call the DFSClient directly to register the delete
          +    @Override
          +    public boolean delete(Path f, final boolean recursive) throws IOException {
          +      return dfs.delete(f.toUri().getPath(), recursive);
          +    }
          

          Please note that the dfs actually is a DFSClient object. In the overridden delete() method, it calls the DFSClient#delete() directly, bypassing all the stats related operations. That's said, we can even remove the statement that creating statistics object in the MyDistributedFileSystem.

          I think we can follow the trunk code in test, in which way initialize() and stats will be avoided elegantly as it's unrelated to the test. This will address Steve Loughran's concern.

          Show
          liuml07 Mingliang Liu added a comment - Sangjin Lee , interesting question. I see code difference in MyDistributedFileSystem between trunk and branch-2 : + // Symlink resolution doesn't work with a mock, since it doesn't + // have a valid Configuration to resolve paths to the right FileSystem. + // Just call the DFSClient directly to register the delete + @Override + public boolean delete(Path f, final boolean recursive) throws IOException { + return dfs.delete(f.toUri().getPath(), recursive); + } Please note that the dfs actually is a DFSClient object. In the overridden delete() method, it calls the DFSClient#delete() directly, bypassing all the stats related operations. That's said, we can even remove the statement that creating statistics object in the MyDistributedFileSystem . I think we can follow the trunk code in test, in which way initialize() and stats will be avoided elegantly as it's unrelated to the test. This will address Steve Loughran 's concern.
          Hide
          sjlee0 Sangjin Lee added a comment -

          Thanks for the analysis Mingliang Liu. Do you know why the trunk version of the test does not fail?

          Show
          sjlee0 Sangjin Lee added a comment - Thanks for the analysis Mingliang Liu . Do you know why the trunk version of the test does not fail?
          Hide
          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 13m 0s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          +1 mvninstall 9m 10s branch-2 passed
          +1 compile 0m 38s branch-2 passed with JDK v1.8.0_91
          +1 compile 0m 42s branch-2 passed with JDK v1.7.0_101
          +1 checkstyle 0m 32s branch-2 passed
          +1 mvnsite 0m 54s branch-2 passed
          +1 mvneclipse 0m 22s branch-2 passed
          -1 findbugs 2m 8s hadoop-hdfs-project/hadoop-hdfs in branch-2 has 1 extant Findbugs warnings.
          +1 javadoc 1m 6s branch-2 passed with JDK v1.8.0_91
          +1 javadoc 1m 46s branch-2 passed with JDK v1.7.0_101
          +1 mvninstall 0m 44s the patch passed
          +1 compile 0m 37s the patch passed with JDK v1.8.0_91
          +1 javac 0m 37s the patch passed
          +1 compile 0m 40s the patch passed with JDK v1.7.0_101
          +1 javac 0m 40s the patch passed
          -1 checkstyle 0m 25s hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 63 unchanged - 0 fixed = 64 total (was 63)
          +1 mvnsite 0m 47s the patch passed
          +1 mvneclipse 0m 11s the patch passed
          +1 whitespace 0m 0s Patch has no whitespace issues.
          +1 findbugs 2m 8s the patch passed
          +1 javadoc 1m 5s the patch passed with JDK v1.8.0_91
          +1 javadoc 1m 49s the patch passed with JDK v1.7.0_101
          -1 unit 50m 8s hadoop-hdfs in the patch failed with JDK v1.8.0_91.
          -1 unit 47m 50s hadoop-hdfs in the patch failed with JDK v1.7.0_101.
          +1 asflicense 0m 21s Patch does not generate ASF License warnings.
          139m 8s



          Reason Tests
          JDK v1.8.0_91 Failed junit tests hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs
          JDK v1.7.0_101 Failed junit tests hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:babe025
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804489/HDFS-10415-branch-2.000.patch
          JIRA Issue HDFS-10415
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname Linux 893b6904bbf8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision branch-2 / 6ebb92c
          Default Java 1.7.0_101
          Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101
          findbugs v3.0.0
          findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
          checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt
          unit https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt
          JDK v1.7.0_101 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15465/testReport/
          modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
          Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15465/console
          Powered by Apache Yetus 0.2.0 http://yetus.apache.org

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 13m 0s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 9m 10s branch-2 passed +1 compile 0m 38s branch-2 passed with JDK v1.8.0_91 +1 compile 0m 42s branch-2 passed with JDK v1.7.0_101 +1 checkstyle 0m 32s branch-2 passed +1 mvnsite 0m 54s branch-2 passed +1 mvneclipse 0m 22s branch-2 passed -1 findbugs 2m 8s hadoop-hdfs-project/hadoop-hdfs in branch-2 has 1 extant Findbugs warnings. +1 javadoc 1m 6s branch-2 passed with JDK v1.8.0_91 +1 javadoc 1m 46s branch-2 passed with JDK v1.7.0_101 +1 mvninstall 0m 44s the patch passed +1 compile 0m 37s the patch passed with JDK v1.8.0_91 +1 javac 0m 37s the patch passed +1 compile 0m 40s the patch passed with JDK v1.7.0_101 +1 javac 0m 40s the patch passed -1 checkstyle 0m 25s hadoop-hdfs-project/hadoop-hdfs: patch generated 1 new + 63 unchanged - 0 fixed = 64 total (was 63) +1 mvnsite 0m 47s the patch passed +1 mvneclipse 0m 11s the patch passed +1 whitespace 0m 0s Patch has no whitespace issues. +1 findbugs 2m 8s the patch passed +1 javadoc 1m 5s the patch passed with JDK v1.8.0_91 +1 javadoc 1m 49s the patch passed with JDK v1.7.0_101 -1 unit 50m 8s hadoop-hdfs in the patch failed with JDK v1.8.0_91. -1 unit 47m 50s hadoop-hdfs in the patch failed with JDK v1.7.0_101. +1 asflicense 0m 21s Patch does not generate ASF License warnings. 139m 8s Reason Tests JDK v1.8.0_91 Failed junit tests hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs JDK v1.7.0_101 Failed junit tests hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs Subsystem Report/Notes Docker Image:yetus/hadoop:babe025 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12804489/HDFS-10415-branch-2.000.patch JIRA Issue HDFS-10415 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 893b6904bbf8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision branch-2 / 6ebb92c Default Java 1.7.0_101 Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_91 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 findbugs v3.0.0 findbugs https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html checkstyle https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt unit https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt unit https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_91.txt https://builds.apache.org/job/PreCommit-HDFS-Build/15465/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_101.txt JDK v1.7.0_101 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/15465/testReport/ modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs Console output https://builds.apache.org/job/PreCommit-HDFS-Build/15465/console Powered by Apache Yetus 0.2.0 http://yetus.apache.org This message was automatically generated.
          Hide
          liuml07 Mingliang Liu added a comment -
          1. If we call the initialize() method in the test, it will pass. Of course we have to do this before mocking the dfsclient object.
            Solution 2 - calling initialize() explicitly
               private static class MyDistributedFileSystem extends DistributedFileSystem {
                 MyDistributedFileSystem() {
            -      statistics = new FileSystem.Statistics("myhdfs"); // can't mock finals
            +      initialize(new URI("hdfs://localhost"), new HdfsConfiguration()); // exception may be thrown
                   dfs = mock(DFSClient.class);
                 }
            
          2. DistributedFileSystem#close() per se is not resilient to invocation prior to initialize() being called. Meanwhile, the MyDistributedFileSystem also has to create a Statistics object explicitly which is partial of what the initialize() method does. To me, this is not ideal. It also has to trick out the deleteOnExit() by returning true for any path. I'm more comfortable by simply using a real DFS object, and validating the order of implicit operations when closing.
          3. Anyway, we can avoid calling initialize() method by constructing the storageStatistics object in the MyDistributedFileSystem as following:
            Solution 3 - constructing the storageStatistics
                 MyDistributedFileSystem() {
                   statistics = new FileSystem.Statistics("myhdfs"); // can't mock finals
            +      storageStatistics = new DFSOpsCountStatistics(); // field needs to be protected
                   dfs = mock(DFSClient.class);
                 }
            
          Show
          liuml07 Mingliang Liu added a comment - If we call the initialize() method in the test, it will pass. Of course we have to do this before mocking the dfsclient object. Solution 2 - calling initialize() explicitly private static class MyDistributedFileSystem extends DistributedFileSystem { MyDistributedFileSystem() { - statistics = new FileSystem.Statistics( "myhdfs" ); // can't mock finals + initialize( new URI( "hdfs: //localhost" ), new HdfsConfiguration()); // exception may be thrown dfs = mock(DFSClient.class); } DistributedFileSystem#close() per se is not resilient to invocation prior to initialize() being called. Meanwhile, the MyDistributedFileSystem also has to create a Statistics object explicitly which is partial of what the initialize() method does. To me, this is not ideal. It also has to trick out the deleteOnExit() by returning true for any path. I'm more comfortable by simply using a real DFS object, and validating the order of implicit operations when closing. Anyway, we can avoid calling initialize() method by constructing the storageStatistics object in the MyDistributedFileSystem as following: Solution 3 - constructing the storageStatistics MyDistributedFileSystem() { statistics = new FileSystem.Statistics( "myhdfs" ); // can't mock finals + storageStatistics = new DFSOpsCountStatistics(); // field needs to be protected dfs = mock(DFSClient.class); }
          Hide
          stevel@apache.org Steve Loughran added a comment -

          we could also consider having dfs.close() being resilient to invocation prior to initialize() being called. Clearly, it used to be. Given that delete() is being called in teardown, why not have that skip the counting.

          of course, there's the situation that normally, you wouldn't have any files to delete in an FS that was never inited —in which case this problem never arises in the real system.

          What happens if initialize() is called in the test?

          Show
          stevel@apache.org Steve Loughran added a comment - we could also consider having dfs.close() being resilient to invocation prior to initialize() being called. Clearly, it used to be. Given that delete() is being called in teardown, why not have that skip the counting. of course, there's the situation that normally, you wouldn't have any files to delete in an FS that was never inited —in which case this problem never arises in the real system. What happens if initialize() is called in the test?
          Hide
          liuml07 Mingliang Liu added a comment -

          If the patch looks good, I think we can apply it to trunk branch as well. Thanks.

          Show
          liuml07 Mingliang Liu added a comment - If the patch looks good, I think we can apply it to trunk branch as well. Thanks.
          Hide
          liuml07 Mingliang Liu added a comment -

          Thanks for reporting this Sangjin Lee and Steve Loughran.

          I think the problem was that, we added a new StorageStatistics object in HADOOP-13065 to DistributedFileSystem and it needs initialized in the initialize() method. In the test, we simply create a new instance by calling constructor, in which way the initialize() method is not called, instead of using the factory methods like FileSystem#get().

          As a fix, I see two possibilities:

          1. Call the initialize() method explicitly before mocking the dfs field. This way, the newly added StorageStatistics object will be initialized before using it.
          2. For the InOrder unit test, actually we can use the spied objects other than mocked objects. This way, we don't need to create our test file system MyDistributedFileSystem.

          I prefer the 2nd option as v0 patch does.

          Show
          liuml07 Mingliang Liu added a comment - Thanks for reporting this Sangjin Lee and Steve Loughran . I think the problem was that, we added a new StorageStatistics object in HADOOP-13065 to DistributedFileSystem and it needs initialized in the initialize() method. In the test, we simply create a new instance by calling constructor, in which way the initialize() method is not called, instead of using the factory methods like FileSystem#get() . As a fix, I see two possibilities: Call the initialize() method explicitly before mocking the dfs field. This way, the newly added StorageStatistics object will be initialized before using it. For the InOrder unit test, actually we can use the spied objects other than mocked objects. This way, we don't need to create our test file system MyDistributedFileSystem . I prefer the 2nd option as v0 patch does.
          Hide
          stevel@apache.org Steve Loughran added a comment -

          happens on Jenkins too

          Show
          stevel@apache.org Steve Loughran added a comment - happens on Jenkins too

            People

            • Assignee:
              liuml07 Mingliang Liu
              Reporter:
              sjlee0 Sangjin Lee
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development