Hadoop Common / HADOOP-5687

Hadoop NameNode throws NPE if fs.default.name is the default value

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Throwing an NPE is confusing; an exception with a useful string description should be thrown instead.

      Attachments

      1. HADOOP-5687.patch
        3 kB
        Philip Zeyliger
      2. HADOOP-5687.patch
        3 kB
        Philip Zeyliger
      3. HADOOP-5687-v3.patch
        5 kB
        Philip Zeyliger
      4. HADOOP-5687-v4.patch
        5 kB
        Philip Zeyliger
      5. HADOOP-5687-v5.patch
        5 kB
        Philip Zeyliger
      6. HADOOP-5687-v5.patch
        5 kB
        Konstantin Shvachko
      7. HADOOP-5687-extension-1.patch
        4 kB
        steve_l
      8. HADOOP-5687-extension-2.patch
        7 kB
        steve_l

        Issue Links

          Activity

          Philip Zeyliger added a comment -

          Hadoop should strive to avoid showing NPEs to the user for bad configurations. The attached patch changes:

          09/04/15 17:40:10 ERROR namenode.NameNode: java.lang.NullPointerException
          at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:135)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:172)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:176)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.getRpcServerAddress(NameNode.java:204)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:242)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:373)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:367)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1121)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1130)
          

          to:

          09/04/15 17:41:31 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.default.name): file:/// has no authority
          at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:180)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.getRpcServerAddress(NameNode.java:219)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:257)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:388)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:382)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1136)
          at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1145)
          

          It also checks that the namenode is started with URIs that have the "hdfs://" scheme.

          If people don't mind the patch, I'll add a test.

          Suresh Srinivas added a comment -
          1. Make DFS_URI_SCHEME final
          2. NameNode.getAddress()
            • use authority that you already have to return as getAddress(authority)
            • use FS_DEFAULT_NAME_KEY instead of hardcoded conf name "fs.default.name"
          3. Hard coded "hdfs" can be replaced with DFS_URI_SCHEME in SecondaryNameNode.getInfoServer()
          Philip Zeyliger added a comment -

          1. Make DFS_URI_SCHEME final

          Done.

          2. NameNode.getAddress()

          * use authority that you already have to return as getAddress(authority)

          Oops, done.

          * use FS_DEFAULT_NAME_KEY instead of hardcoded conf name "fs.default.name"

          Done. I had to make FS_DEFAULT_NAME_KEY public.

          3. Hard coded "hdfs" can be replaced with DFS_URI_SCHEME in SecondaryNameNode.getInfoServer()

          Done.

          Thanks for the suggestions! I've uploaded a new patch.

          Konstantin Shvachko added a comment -

          Philip,

          1. I think the NPE related to authority = null should be handled in NetUtils.createSocketAddr(). This is a general problem, not just the name-node's.
          2. Checking that URI has the hdfs scheme is basically an assert, otherwise this would not be called. But if you want it, could you please
            • use HDFS_URI_SCHEME instead of DFS_URI_SCHEME,
            • and place it in FSConstants rather than DistributedFileSystem
          Philip Zeyliger added a comment -

          Hi Konstantin,

          Thanks for taking a look at this.

          I think the NPE related to authority = null should be handled in NetUtils.createSocketAddr(). This is a general problem, not just the name-node's.

          The best I could do in createSocketAddr() is throw IllegalArgumentException instead of NPE. (For readability, I like Preconditions.checkNotNull() from Google's Collections Library, http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/base/Preconditions.html#checkNotNull(T), but "if (x == null) { throw new IllegalArgumentException("..."); }" is the same.) That's reasonable, but not terribly useful by itself. What I'd like to do is give the user a good error message.

          Would you prefer that I add the check to throw IllegalArgumentException in NetUtils? I feel like I'd have to keep an if statement (or catch the IllegalArgumentException) in NameNode to keep the nice error message.
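
          Concretely, the most the shared utility could say is something like this (a rough sketch with simplified host:port parsing, not the committed code):

          // Sketch: fail fast in NetUtils.createSocketAddr rather than letting a
          // null target surface later as a NullPointerException. The parsing
          // here is simplified for illustration.
          public static InetSocketAddress createSocketAddr(String target) {
            if (target == null) {
              throw new IllegalArgumentException("Socket address target is null; "
                  + "was a required configuration property left unset?");
            }
            int colon = target.indexOf(':');
            if (colon < 0) {
              throw new IllegalArgumentException("Not a host:port pair: " + target);
            }
            String host = target.substring(0, colon);
            int port = Integer.parseInt(target.substring(colon + 1));
            return new InetSocketAddress(host, port);
          }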

          Checking that URI has the hdfs scheme is basically an assert, otherwise this would not be called.

          I don't think Hadoop checks for "hdfs" higher in the stack; I set fs.default.name to hdXXXXfs://doorstop.local/, and the check I proposed in the patch was triggered. I take the view that assertions should be statements that the programmer feels must be true if the program has no bugs. Just like a compiler doesn't throw an assertion for a syntax error in your program, Hadoop shouldn't assert for a configuration error. (I'm thinking of the first definition in http://c2.com/cgi/wiki?WhatAreAssertions).

          use HDFS_URI_SCHEME instead of DFS_URI_SCHEME

          Sure; done.

          and place it in FSConstants rather than DistributedFileSystem

          Sure; done.

          Konstantin Shvachko added a comment -

          > Would you prefer that I add the check to throw IllegalArgumentException in NetUtils?

          Yes, because it will work for other modules like JobTracker as well as the name-node.

          > I feel like I'd have to keep an if statement (or catch the IllegalArgumentException) in NameNode to keep the nice error message.

          By generating the description in the name-node you can only add that this is coming from the NameNode. But that will be seen in the stack trace one way or another.

          > I don't think Hadoop checks for "hdfs" higher in the stack; I set fs.default.name to hdXXXXfs://doorstop.local/, and the check was triggered.

          If you get address from FileSystem, and it is not hdfs then other file system code (LocalFileSystem or kfs or S3) will be triggered.
          But when we start name-node, yes you are right, we should check the scheme. Not an assert.

          Philip Zeyliger added a comment -

          By generating the description in the name-node you can only add that this is coming from the NameNode. But that will be seen in the stack trace one way or another.

          This is the only place where I can reference the config property that this magically came out of.

          I've added the check for NetUtils. Uploading the latest patch.

          Konstantin Shvachko added a comment -

          This looks good to me. 2 minor things.

          1. There is an empty line change in DistributedFileSystem. Could you please remove it.
          2. Could you please move the declaration of the new constant in FSConstants up above the LAYOUT_VERSION declaration.
            We want to keep it last to make it more visible when the layout version needs to be updated.

          Other than that it is ready for Hudson.

          Tsz Wo Nicholas Sze added a comment -

          This problem also occurs in other parts, e.g. HADOOP-4573.

          Tsz Wo Nicholas Sze added a comment -

          HADOOP-4313: a similar issue.

          Tsz Wo Nicholas Sze added a comment -

          HADOOP-5140: similar problem in JobTracker

          Tsz Wo Nicholas Sze added a comment -

          HADOOP-2977: another similar issue

          Philip Zeyliger added a comment -

          New patch with two changes Konstantin suggested.

          Tsz Wo Nicholas Sze added a comment -
          +    if (! filesystemURI.getScheme().equals(FSConstants.HDFS_URI_SCHEME)) {
          

          I suggest changing the line above to

          +    if (!FSConstants.HDFS_URI_SCHEME.equalsIgnoreCase(filesystemURI.getScheme())) {
          

          since (1) schemes are case-insensitive and (2) filesystemURI.getScheme() may return null.

          Philip Zeyliger added a comment -

          Agreed. I've modified the two places that use equals() to use equalsIgnoreCase() and have made the method be invoked on the constant.
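
          For the record, the check now has roughly this shape (a sketch pieced together from the review comments above; the message text is illustrative, not a verbatim excerpt of the patch):

          // Sketch of NameNode.getAddress(Configuration) after the review:
          // validate the scheme and authority of fs.default.name before use.
          public static InetSocketAddress getAddress(Configuration conf) {
            URI filesystemURI = FileSystem.getDefaultUri(conf);
            String authority = filesystemURI.getAuthority();
            if (authority == null) {
              throw new IllegalArgumentException(
                  "Invalid URI for NameNode address (check "
                  + FileSystem.FS_DEFAULT_NAME_KEY + "): "
                  + filesystemURI + " has no authority.");
            }
            if (!FSConstants.HDFS_URI_SCHEME.equalsIgnoreCase(
                filesystemURI.getScheme())) {
              throw new IllegalArgumentException(
                  "Invalid URI for NameNode address (check "
                  + FileSystem.FS_DEFAULT_NAME_KEY + "): "
                  + filesystemURI + " is not of scheme '"
                  + FSConstants.HDFS_URI_SCHEME + "'.");
            }
            return getAddress(authority);
          }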

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12406153/HADOOP-5687-v5.patch
          against trunk revision 768073.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/238/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/238/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/238/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/238/console

          This message is automatically generated.

          Philip Zeyliger added a comment -

          I think Hudson ran out of memory, and I don't see the list of failed tests...

          Total time: 244 minutes 59 seconds
          Recording test results
          FATAL: Java heap space
          java.lang.OutOfMemoryError: Java heap space
          	....
          
          Tsz Wo Nicholas Sze added a comment -

          Resubmitting this. Hudson is working fine recently.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12406153/HADOOP-5687-v5.patch
          against trunk revision 771661.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/292/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/292/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/292/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/292/console

          This message is automatically generated.

          Philip Zeyliger added a comment -

          The only failure I found was:

          [0]doorstop:Downloads(73605)$grep "\[junit\] Test.*FAILED" consoleText
               [exec]     [junit] Test org.apache.hadoop.mapred.TestCapacitySchedulerConf FAILED
          

          This doesn't look related, and, indeed, running "ant -Dtestcase=TestCapacitySchedulerConf test" on a clean version of trunk triggers this error too.

          Are we ok with this going in without unit tests, or should I write something that tests that the right exception is thrown?

          Konstantin Shvachko added a comment -

          The patch did not apply because the import section has been changed in SecondaryNameNode. I fixed it.

          Konstantin Shvachko added a comment -

          I just committed this. Thank you Philip.

          Konstantin Shvachko added a comment -

          Philip, could you please close the related issues that this patch fixes.

          steve_l added a comment -

          Linking to my duplicate bugrep

          1. my code adds a check for FileSystem.getDefaultUri(conf) being null, that could go in on top of what is checked in. It's unlikely, but if you try hard enough, it happens.
          2. some tests for this would be lightweight and fast. I will try to write them.
          steve_l added a comment -

          I'm going to stick some more tests in. From my experiments, we need to have all the methods that work out the filesystem fail in ways that are meaningful and helpful.

          1. Telling someone their URI lacks an authority because their default fs is file:// certainly finds a symptom of the problem, but misses the point that file: URIs are a no-no.
          2. HADOOP-5140 shows the problem surfaces elsewhere if the URI in the config isn't parsable. In that situation you get a fairly meaningless message. If you are lucky, the error message is vaguely meaningful, but it won't be enough to help someone trying to bring up a cluster work out what the problem is, not without delving into the code and discovering that Hadoop was looking up fs.default.name at the time:
          Testcase: testHttpNoHostURI took 0.044 sec
          	Caused an ERROR
          null
          java.lang.IllegalArgumentException
          	at java.net.URI.create(URI.java:842)
          	at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:105)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:192)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.getAddress(TestNameNodeAddress.java:54)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.testHttpNoHostURI(TestNameNodeAddress.java:102)
          Caused by: java.net.URISyntaxException: Expected authority at index 7: http://
          	at java.net.URI$Parser.fail(URI.java:2809)
          	at java.net.URI$Parser.failExpecting(URI.java:2815)
          	at java.net.URI$Parser.parseHierarchical(URI.java:3063)
          	at java.net.URI$Parser.parse(URI.java:3014)
          	at java.net.URI.<init>(URI.java:578)
          	at java.net.URI.create(URI.java:840)
          
          Testcase: testUnexpandedHdfsURI took 0.036 sec
          	Caused an ERROR
          null
          java.lang.IllegalArgumentException
          	at java.net.URI.create(URI.java:842)
          	at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:105)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:192)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.getAddress(TestNameNodeAddress.java:54)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.testUnexpandedHdfsURI(TestNameNodeAddress.java:118)
          Caused by: java.net.URISyntaxException: Illegal character in authority at index 7: hdfs://${hostname}/
          	at java.net.URI$Parser.fail(URI.java:2809)
          	at java.net.URI$Parser.parseAuthority(URI.java:3147)
          	at java.net.URI$Parser.parseHierarchical(URI.java:3058)
          	at java.net.URI$Parser.parse(URI.java:3014)
          	at java.net.URI.<init>(URI.java:578)
          	at java.net.URI.create(URI.java:840)
          
          steve_l added a comment -

          reopening so we can get the error messages right

          steve_l added a comment -

          this adds a new test case for the method. These tests fail deliberately, because I want to raise the issue of what the error messages should be, rather than just verify that we got the fairly meaningless java.net.URISyntaxException message nested inside something else.

          steve_l added a comment -

          This will fail, but I want everyone to see the messages are fairly useless. Half the problem is URI.create, which turns the URISyntaxException into a runtime IllegalArgumentException that floats up without telling the user why things went wrong.

          I propose that, at a minimum, FileSystem.getDefaultUri() should replace URI.create() with something that catches the syntax exceptions and says that fs.default.name is wrong. If someone really has time on their hands, they could look through all uses of URI.create() outside of the test code and decide if they should stay. Having had a quick run through, I don't see any that are as urgent as this one.
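
          Something along these lines (a rough sketch of the proposal; fixName() is the existing name-normalizing helper discussed below, and the exact message wording is illustrative):

          // Sketch: wrap the URI parse in FileSystem.getDefaultUri so that a bad
          // fs.default.name value fails with a pointed message instead of the
          // bare IllegalArgumentException thrown by URI.create().
          public static URI getDefaultUri(Configuration conf) {
            String name = conf.get("fs.default.name", "file:///");
            try {
              return new URI(fixName(name));
            } catch (URISyntaxException e) {
              throw new IllegalArgumentException(
                  "Unable to parse the fs.default.name value of \"" + name
                  + "\": " + e.getReason(), e);
            }
          }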

          steve_l added a comment -

          also, if fs.default.name is "", it gets patched to be hdfs://, which then fails later in the parse phase for having no authority.

          I'm patching FileSystem to be more detailed, with something like

          Testcase: testEmptyURI took 0.08 sec
          	Caused an ERROR
          Unable to parse the fs.default.name value of "" patched to be "hdfs://" :Expected authority at index 7
          java.lang.IllegalArgumentException: Unable to parse the fs.default.name value of "" patched to be "hdfs://" :Expected authority at index 7
          	at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:114)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:192)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.getAddress(TestNameNodeAddress.java:54)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.testEmptyURI(TestNameNodeAddress.java:90)
          Caused by: java.net.URISyntaxException: Expected authority at index 7: hdfs://
          	at java.net.URI$Parser.fail(URI.java:2809)
          	at java.net.URI$Parser.failExpecting(URI.java:2815)
          	at java.net.URI$Parser.parseHierarchical(URI.java:3063)
          	at java.net.URI$Parser.parse(URI.java:3014)
          	at java.net.URI.<init>(URI.java:578)
          	at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:108)
          

          That is, it prints both the patched and unpatched values when patching has taken place, but not when there is no patching:

          Unable to parse the fs.default.name value of "hdfs://${hostname}/" :Illegal character in authority at index 7
          java.lang.IllegalArgumentException: Unable to parse the fs.default.name value of "hdfs://${hostname}/" :Illegal character in authority at index 7
          	at org.apache.hadoop.fs.FileSystem.getDefaultUri(FileSystem.java:114)
          	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:192)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.getAddress(TestNameNodeAddress.java:54)
          	at org.apache.hadoop.hdfs.TestNameNodeAddress.testUnexpandedHdfsURI(TestNameNodeAddress.java:118
          

          I will file a separate bugrep against fixName(). This patch changes the reporting, not the name fixing.

          steve_l added a comment -

          This patch:
          • checks for hdfs before worrying about the authority
          • gives full diagnostics on all calls to FileSystem.getDefaultUri()
          • has lots of tests

          HADOOP-5901 is the report about the behavior of fixName(); the tests here do assert that it works as currently designed. If the odd patching is removed, these tests will need changing, as the error messages will be different.

          Philip Zeyliger added a comment -

          I like the patches. Quick comments:

          @see <a href="https://issues.apache.org/jira/browse/HADOOP-5687">HADOOP-5687</a>

          I was under the impression that we were discouraged from putting links to JIRA issues; is that less so the case in tests?

          return new URI(patchedfsuri);

          I think the indentation here is wrong.

            public TestNameNodeAddress(String s) {
              super(s);
            }
          
            @Override
            protected void setUp() throws Exception {
            }
          

          These are default, right? You can skip them.

                if (!e.toString().contains(text)) {
                  throw e;
                }
          

          You could do assertTrue("Expected foo but got bar.", e.toString().contains(text)) instead of re-throwing. Both end up failing the test, but the assertion failure makes it clearer that this isn't a random hard-to-diagnose exception.

          [0]doorstop5687:hadoop-git(93063)$cat build/test/TEST-org.apache.hadoop.hdfs.TestNameNodeAddress.txt  | grep -B 1 ERROR
          Testcase: testNullURI took 0.142 sec
          	Caused an ERROR
          --
          Testcase: testEmptyURI took 0.095 sec
          	Caused an ERROR
          --
          Testcase: testFileURI took 0.044 sec
          	Caused an ERROR
          --
          Testcase: testLocalMapsToFileURI took 0.037 sec
          	Caused an ERROR
          --
          Testcase: testHttpNoHostURI took 0.048 sec
          	Caused an ERROR
          --
          Testcase: testUnexpandedHdfsURI took 0.094 sec
          	Caused an ERROR
          

          I think some of the tests expect an exception to be thrown but aren't marked as such.

          steve_l added a comment -

          Phil, are you looking at the -1.patch or the -2.patch of mine? The #1 patch certainly threw errors, as I was looking at the messages, but I thought I'd cleaned that and the empty setUp() out.

          Philip Zeyliger added a comment -

          You're right, the -2.patch doesn't fail tests.

          Konstantin Shvachko added a comment -

          Steve, I see you are submitting patches here to current trunk, which includes the patch committed in this issue.
          This is not how it should work.
          If you want to submit a patch to the current code you should open a new issue rather than re-opening the one that was committed.
          Each commit should correspond to only one issue.
          Please submit your patch in a new issue or in HADOOP-5200.
          I am closing this again unless you think the original patch should be reverted.

          steve_l added a comment -

          well, I was just adding the tests to go with the patch... it just became a bit of a rework of the original patch. I will open a new one.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12408840/HADOOP-5687-extension-2.patch
          against trunk revision 778388.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/402/console

          This message is automatically generated.


            People

            • Assignee: Philip Zeyliger
            • Reporter: Philip Zeyliger
            • Votes: 1
            • Watchers: 6
