Hadoop Common
  1. Hadoop Common
  2. HADOOP-7602

wordcount, sort etc on har files fails with NPE

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 0.20.205.0
    • Component/s: None
    • Labels:
      None

      Description

      wordcount, sort etc on har files fails with NPE@createSocketAddr(NetUtils.java:137).

      1. HADOOP-7602.patch
        2 kB
        John George
      2. hadoop-7602.trunk.2.patch
        2 kB
        John George
      3. hadoop-7602.trunk.1.patch
        3 kB
        John George
      4. hadoop-7602.trunk.patch
        3 kB
        John George
      5. hadoop-7602.trunk.patch
        3 kB
        John George
      6. hadoop-7602.9.patch
        9 kB
        John George
      7. hadoop-7602.8.patch
        8 kB
        John George
      8. hadoop-7602.7.patch
        8 kB
        John George
      9. hadoop-7602.6.patch
        9 kB
        John George
      10. hadoop-7602.4.patch
        9 kB
        John George
      11. hadoop-7602.3.patch
        4 kB
        John George
      12. hadoop-7602.2.patch
        2 kB
        John George
      13. hadoop-7602.daryns_comment.2.patch
        3 kB
        John George
      14. hadoop-7602.daryns_comment.patch
        3 kB
        John George
      15. hadoop-7602.1.patch
        2 kB
        John George
      16. hadoop-7602.patch
        0.7 kB
        John George

        Issue Links

          Activity

          Hide
          Matt Foley added a comment -

          Closed upon release of 0.20.205.0

          Show
          Matt Foley added a comment - Closed upon release of 0.20.205.0
          Hide
          Matt Foley added a comment -

          If trunk versions are no longer target for this jira, it can be resolved.

          Show
          Matt Foley added a comment - If trunk versions are no longer target for this jira, it can be resolved.
          Hide
          John George added a comment -

          Removing 0.23 and 0.24 from target versions since MAPREDUCE-3149 is tracking the trunk version for this bug.

          Show
          John George added a comment - Removing 0.23 and 0.24 from target versions since MAPREDUCE-3149 is tracking the trunk version for this bug.
          Hide
          John George added a comment -

          Created MAPREDUCE-3149 and added that as a blocker for this JIRA. I could not make it a sub-task since it was part of a different project (mapreduce project)

          Show
          John George added a comment - Created MAPREDUCE-3149 and added that as a blocker for this JIRA. I could not make it a sub-task since it was part of a different project (mapreduce project)
          Hide
          Matt Foley added a comment -

          It's failing because the Jira is against HADOOP, but the patch is against a MAPREDUCE file. Unfortunately, despite the "unsplit", our build tools still can't handle this.

          For a bug which was only a trunk issue, the solution would be to "move" the jira to MAPREDUCE. But since this started as a 0.20 issue, and the 0.20 patch is actually cross-component, that doesn't seem like the right solution. I recommend instead opening a sub-task jira in MAPREDUCE for the port to trunk, and referencing HADOOP-7602 as well as the sub-task bug number in the commit message, so the commit shows up here.

          Show
          Matt Foley added a comment - It's failing because the Jira is against HADOOP, but the patch is against a MAPREDUCE file. Unfortunately, despite the "unsplit", our build tools still can't handle this. For a bug which was only a trunk issue, the solution would be to "move" the jira to MAPREDUCE. But since this started as a 0.20 issue, and the 0.20 patch is actually cross-component, that doesn't seem like the right solution. I recommend instead opening a sub-task jira in MAPREDUCE for the port to trunk, and referencing HADOOP-7602 as well as the sub-task bug number in the commit message, so the commit shows up here.
          Hide
          Ravi Prakash added a comment -

          Patch applied successfully to my dev environment too. Is something wrong with Jenkins? Pretty small patch. I don't see why it should fail.

          Show
          Ravi Prakash added a comment - Patch applied successfully to my dev environment too. Is something wrong with Jenkins? Pretty small patch. I don't see why it should fail.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12497911/HADOOP-7602.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/269//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12497911/HADOOP-7602.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/269//console This message is automatically generated.
          Hide
          John George added a comment -

          The same patch seems to apply to the trunk view on my desktop... Trying it once again.

          Show
          John George added a comment - The same patch seems to apply to the trunk view on my desktop... Trying it once again.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12497682/hadoop-7602.trunk.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/264//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12497682/hadoop-7602.trunk.2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/264//console This message is automatically generated.
          Hide
          John George added a comment -

          Based on the discussion from HADOOP-7661, this JIRA is not applicable to trunk (since MAPREDUCE-2780 will fix it differently in trunk). Herewith I am attaching the tests that were written for this JIRA to be ported to trunk.

          Show
          John George added a comment - Based on the discussion from HADOOP-7661 , this JIRA is not applicable to trunk (since MAPREDUCE-2780 will fix it differently in trunk). Herewith I am attaching the tests that were written for this JIRA to be ported to trunk.
          Hide
          Matt Foley added a comment -

          Hi John, please bring the trunk patch up to snuff so it will pass test-patch. Thanks.

          Show
          Matt Foley added a comment - Hi John, please bring the trunk patch up to snuff so it will pass test-patch. Thanks.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494703/hadoop-7602.trunk.1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/196//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494703/hadoop-7602.trunk.1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/196//console This message is automatically generated.
          Hide
          John George added a comment -

          -1 overall.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          -1 release audit. The applied patch generated 9 release audit warnings (more than the trunk's current 0 warnings).

          Show
          John George added a comment - -1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. -1 release audit. The applied patch generated 9 release audit warnings (more than the trunk's current 0 warnings).
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494683/hadoop-7602.trunk.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/194//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494683/hadoop-7602.trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/194//console This message is automatically generated.
          Hide
          John George added a comment -

          uploading the same trunk patch again for test-patch to go through.

          Show
          John George added a comment - uploading the same trunk patch again for test-patch to go through.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494679/hadoop-7602.trunk.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/193//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494679/hadoop-7602.trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/193//console This message is automatically generated.
          Hide
          John George added a comment -

          Thanks Jitendra, Daryn, Nathan for reviewing the 20-security patch. Just uploaded the trunk version of the patch.

          Show
          John George added a comment - Thanks Jitendra, Daryn, Nathan for reviewing the 20-security patch. Just uploaded the trunk version of the patch.
          Hide
          John George added a comment -

          patch for trunk

          Show
          John George added a comment - patch for trunk
          Hide
          Jitendra Nath Pandey added a comment -

          Committed to 205, and 20-security. This needs a fix in trunk as well.

          Show
          Jitendra Nath Pandey added a comment - Committed to 205, and 20-security. This needs a fix in trunk as well.
          Hide
          John George added a comment -

          < +1 Looks great! I'm not sure that I see the previously mentioned variations of har uris being tested. Are they implicitly tested elsewhere?
          Yes, har://hdfs-host:port/path is being tests in TestHadoopArchives.java (which is included with this patch).

          Show
          John George added a comment - < +1 Looks great! I'm not sure that I see the previously mentioned variations of har uris being tested. Are they implicitly tested elsewhere? Yes, har://hdfs-host:port/path is being tests in TestHadoopArchives.java (which is included with this patch).
          Hide
          John George added a comment -

          HADOOP-7638 filed.

          Show
          John George added a comment - HADOOP-7638 filed.
          Hide
          Nathan Roberts added a comment -

          +1 for current patch.

          As far as the two options, I'd prefer throwing the IllegalArgumentException instead of tripping the NPE. This seems cleaner than an NPE and allows us to catch errors during the setup of the service rather than limping along with a bogus service name.
          The current patch doesn't do this but I'm ok with doing that in a separate jira since it's not required to fix the problem with HAR.

          I think it would be a good idea to file an additional jira to take another look at the visibility of the security utils and things like getCanonicalService. Doesn't seem like these should be fully public. Seems like a good activity for trunk.

          Show
          Nathan Roberts added a comment - +1 for current patch. As far as the two options, I'd prefer throwing the IllegalArgumentException instead of tripping the NPE. This seems cleaner than an NPE and allows us to catch errors during the setup of the service rather than limping along with a bogus service name. The current patch doesn't do this but I'm ok with doing that in a separate jira since it's not required to fix the problem with HAR. I think it would be a good idea to file an additional jira to take another look at the visibility of the security utils and things like getCanonicalService. Doesn't seem like these should be fully public. Seems like a good activity for trunk.
          Hide
          Jitendra Nath Pandey added a comment -

          +1 for the current patch.
          Some usage might be calling getCanonicalService and assuming non null value in return. But as it is, this method is not returning anything useful for HARFileSystem. Therefore it is not a big risk to return null for this method.

          Summary of a discussion with Nathan:
          SecurityUtil APIs should be fixed. Throwing a checked exception might cause backward compatibility issues and returning an empty string may have an issue that some user code may be parsing the service assuming it is ip:port. Although the API doesn't gaurantee a specific service format but in hadoop it has been assumed to be "ip:port"
          So we have two options:
          1) A very conservative option to return ":0", which is the current behavior in certain cases of invalid input.
          2) Throw IllegalArgumentException which is better than NPE and is semantically clearer.

          Show
          Jitendra Nath Pandey added a comment - +1 for the current patch. Some usage might be calling getCanonicalService and assuming non null value in return. But as it is, this method is not returning anything useful for HARFileSystem. Therefore it is not a big risk to return null for this method. Summary of a discussion with Nathan: SecurityUtil APIs should be fixed. Throwing a checked exception might cause backward compatibility issues and returning an empty string may have an issue that some user code may be parsing the service assuming it is ip:port. Although the API doesn't gaurantee a specific service format but in hadoop it has been assumed to be "ip:port" So we have two options: 1) A very conservative option to return ":0", which is the current behavior in certain cases of invalid input. 2) Throw IllegalArgumentException which is better than NPE and is semantically clearer.
          Hide
          Daryn Sharp added a comment -

          +1 Looks great! I'm not sure that I see the previously mentioned variations of har uris being tested. Are they implicitly tested elsewhere?

          Show
          Daryn Sharp added a comment - +1 Looks great! I'm not sure that I see the previously mentioned variations of har uris being tested. Are they implicitly tested elsewhere?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494510/hadoop-7602.9.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/183//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494510/hadoop-7602.9.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/183//console This message is automatically generated.
          Hide
          Jitendra Nath Pandey added a comment -

          I am ok with throwing exceptions in SecurityUtil methods instead of empty strings. One caveat is that these are public methods, changing signature could cause incompatibility.
          I believed that canonical service name could be a more generic concept used in many contexts, therefore I have been reluctant to tie it down to only tokens. But as of now canonical service name is used only for token service currently, therefore returning null if token not supported should be ok. But again this is a public API, returning null is a semantic difference and may cause issues in applications already using this method and expecting a string back in all cases.
          It may be unlikely that these methods are actually being used by any application, but these are part of public APIs so the compatibility point needs to be thought through.

          Show
          Jitendra Nath Pandey added a comment - I am ok with throwing exceptions in SecurityUtil methods instead of empty strings. One caveat is that these are public methods, changing signature could cause incompatibility. I believed that canonical service name could be a more generic concept used in many contexts, therefore I have been reluctant to tie it down to only tokens. But as of now canonical service name is used only for token service currently, therefore returning null if token not supported should be ok. But again this is a public API, returning null is a semantic difference and may cause issues in applications already using this method and expecting a string back in all cases. It may be unlikely that these methods are actually being used by any application, but these are part of public APIs so the compatibility point needs to be thought through.
          Hide
          Nathan Roberts added a comment -

          Yes. I like the idea of only changing getCanonicalServiceName within the har fs and then have the TokenCache skip that filesystem when gathering tokens. This seems like the least risky change. We can address the fact that buildDTServiceName needs a better way than NPE to indicate there is a problem in the ip->host jira or another one altogether. If someone besides HAR is going to run into the NPE, it's clear they're doing something wrong so it wouldn't necessarily be a bad thing to track them down even if it's via NPE.

          Show
          Nathan Roberts added a comment - Yes. I like the idea of only changing getCanonicalServiceName within the har fs and then have the TokenCache skip that filesystem when gathering tokens. This seems like the least risky change. We can address the fact that buildDTServiceName needs a better way than NPE to indicate there is a problem in the ip->host jira or another one altogether. If someone besides HAR is going to run into the NPE, it's clear they're doing something wrong so it wouldn't necessarily be a bad thing to track them down even if it's via NPE.
          Hide
          Daryn Sharp added a comment -

          It appears that pre-205 behavior for getCanonicalServiceName was to return ":0" if the fs uri didn't contain an authority, ie. "har:///my.har", "path/my.har", etc. Otherwise if it had an authority, it would try to convert the host to an ip and return it. If that failed, just return the unresolved host.

          Nathan and I discussed, and here's our recommendation:

          The real problem is that a filesystem has no means of signifying that it does not have tokens. Logically, a filesystem with no tokens should not have a service name for its non-existent token. Perhaps getCanonicalServiceName should return null if the filesystem does not support tokens. The TokenCache can skip that filesystem while getting tokens. Null avoids a conflict with any valid credentials key, including empty string.

          The solution will be similar to John's earlier patch. Instead of har returning an empty string service, it will return null. The token cache will skip filesystems that return a null service. The filesystem base class will return null if the uri has no authority. I think this covers all use cases with <10 LOC.

          The NPE in the token method will be changed into a more reasonable exception in the ip->host change. However, these exceptions won't be a problem if har returns a null service, instead of calling the methods with bad input that triggers the exception.

          Thoughts? If this is not acceptable, what are the sticking points?

          Show
          Daryn Sharp added a comment - It appears that pre-205 behavior for getCanonicalServiceName was to return ":0" if the fs uri didn't contain an authority, ie. "har:///my.har", "path/my.har", etc. Otherwise if it had an authority, it would try to convert the host to an ip and return it. If that failed, just return the unresolved host. Nathan and I discussed, and here's our recommendation: The real problem is that a filesystem has no means of signifying that it does not have tokens. Logically, a filesystem with no tokens should not have a service name for its non-existent token. Perhaps getCanonicalServiceName should return null if the filesystem does not support tokens. The TokenCache can skip that filesystem while getting tokens. Null avoids a conflict with any valid credentials key, including empty string. The solution will be similar to John's earlier patch. Instead of har returning an empty string service, it will return null. The token cache will skip filesystems that return a null service. The filesystem base class will return null if the uri has no authority. I think this covers all use cases with <10 LOC. The NPE in the token method will be changed into a more reasonable exception in the ip->host change. However, these exceptions won't be a problem if har returns a null service, instead of calling the methods with bad input that triggers the exception. Thoughts? If this is not acceptable, what are the sticking points?
          Hide
          Daryn Sharp added a comment -

          Yes. Had a lengthy IM, here's what I believe to be the summary:

          • We agree that something better than a NPE should thrown on bad input.
            • We disagree on whether it should return empty string (Jitendra), or throw a more descriptive exception (me).
            • We disagree that removing the exception invites other problems for all other callers of the method.
            • We disagree that the NPE is better addressed in another bug. Note: I have fixed this on the service ip->host.
          • I think we disagree that har should not be calling buildDTServiceName/buildDTAuthority.
            • I maintain it should not be called because it's building a Credentials key for a token. Har does not have tokens so the Credentials key is of no value.
            • Har should just directly return a key instead of passing junk to a method that should always result in an exception.

          I believe the simple solution is to fix the har filesystem with a few lines. John, Nathan, and I will review the code and post recommendations.

          Show
          Daryn Sharp added a comment - Yes. Had a lengthy IM, here's what I believe to be the summary: We agree that something better than a NPE should thrown on bad input. We disagree on whether it should return empty string (Jitendra), or throw a more descriptive exception (me). We disagree that removing the exception invites other problems for all other callers of the method. We disagree that the NPE is better addressed in another bug. Note: I have fixed this on the service ip->host. I think we disagree that har should not be calling buildDTServiceName/buildDTAuthority. I maintain it should not be called because it's building a Credentials key for a token. Har does not have tokens so the Credentials key is of no value. Har should just directly return a key instead of passing junk to a method that should always result in an exception. I believe the simple solution is to fix the har filesystem with a few lines. John, Nathan, and I will review the code and post recommendations.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494334/hadoop-7602.8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/180//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494334/hadoop-7602.8.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/180//console This message is automatically generated.
          Hide
          John George added a comment -

          removing System.out.println and LOG4j declarations from test file....
          and attaching another patch for completeness.

          Show
          John George added a comment - removing System.out.println and LOG4j declarations from test file.... and attaching another patch for completeness.
          Hide
          Suresh Srinivas added a comment -

          -1 buildDTAuthority absolutely cannot return an empty string in any other context w/o wrecking havoc on token processing.

          Daryn, after Jitendra's explanation, are you still -1?

          Show
          Suresh Srinivas added a comment - -1 buildDTAuthority absolutely cannot return an empty string in any other context w/o wrecking havoc on token processing. Daryn, after Jitendra's explanation, are you still -1?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494311/hadoop-7602.7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/177//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494311/hadoop-7602.7.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/177//console This message is automatically generated.
          Hide
          Jitendra Nath Pandey added a comment -

          > -1 buildDTAuthority absolutely cannot return an empty string in any other context w/o wrecking havoc on token processing.
          John's patch is returning empty string only for cases where it would otherwise throw an NPE. Empty string is being thrown only if authority is not a valid host, in that case it will not cause any problem because no connection or tokens would work for such hosts anyway.

          I look at this jira solving an immediate problem that NPEs are being thrown if authority is invalid or hosts are not resolved.
          There may be alternative approaches to handle the bad inputs to SecurityUtil methods in more elegant ways, but that can be taken outside this jira.

          Show
          Jitendra Nath Pandey added a comment - > -1 buildDTAuthority absolutely cannot return an empty string in any other context w/o wrecking havoc on token processing. John's patch is returning empty string only for cases where it would otherwise throw an NPE. Empty string is being thrown only if authority is not a valid host, in that case it will not cause any problem because no connection or tokens would work for such hosts anyway. I look at this jira solving an immediate problem that NPEs are being thrown if authority is invalid or hosts are not resolved. There may be alternative approaches to handle the bad inputs to SecurityUtil methods in more elegant ways, but that can be taken outside this jira.
          Hide
          Jitendra Nath Pandey added a comment -

          comments on John's latest patch:
          1) No change to TokenCache is required. It should work alright as long as we are not getting an NPE.
          2) Instead of checking for addr.isUnresolved, we should just check for null, because that is what we want to guard against in this jira.

          Show
          Jitendra Nath Pandey added a comment - comments on John's latest patch: 1) No change to TokenCache is required. It should work alright as long as we are not getting an NPE. 2) Instead of checking for addr.isUnresolved, we should just check for null, because that is what we want to guard against in this jira.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494298/hadoop-7602.6.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/173//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494298/hadoop-7602.6.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/173//console This message is automatically generated.
          Hide
          Daryn Sharp added a comment -

          -1 buildDTAuthority absolutely cannot return an empty string in any other context w/o wrecking havoc on token processing.

          The problem is not these methods. The problem is that FileSystem.getCanonicalServiceName is passing bad input. The problem should be fixed there.

          Show
          Daryn Sharp added a comment - -1 buildDTAuthority absolutely cannot return an empty string in any other context w/o wrecking havoc on token processing. The problem is not these methods. The problem is that FileSystem.getCanonicalServiceName is passing bad input. The problem should be fixed there.
          Hide
          John George added a comment -

          patch based on Jitendra's comments.

          Show
          John George added a comment - patch based on Jitendra's comments.
          Hide
          Daryn Sharp added a comment -

          Another possibility is to push the returning of an empty string for invalid input down into FileSystem.getCanonicalService.

          Show
          Daryn Sharp added a comment - Another possibility is to push the returning of an empty string for invalid input down into FileSystem.getCanonicalService .
          Hide
          Daryn Sharp added a comment -

          I don't agree with the re-introducing buggy behavior to low-level token methods to account for a bad interaction between the TokenCache and HarFileSystem. The token methods are called in many other places that cannot tolerate an empty string. Allowing an empty string to be returned for invalid input in any other context will result in onerous exceptions like those being fixed, unusable tokens, or silent token renewal failures.

          The root issue is that TokenCache.obtainTokensForNamenodes calls getCanonicalServiceName to use as the key for the credentials map. getCanonicalServiceName is expected to return the "ip:port" authority. The HarFileSystem has no or an invalid authority, because it has no tokens, so it relied on the low-level token methods to return an empty string on bad input. It should never have gotten that far, so John made it return an empty string immediately instead of relying on bad behavior in the low-level token methods.

          In effect, har now behaves exactly the same way as it does in 204 w/o relying on bugs. Excuse me for repeating: Allowing an empty string to be returned for invalid input in any other context will result in onerous exceptions like those being fixed, unusable tokens, or silent token renewal failures.

          Show
          Daryn Sharp added a comment - I don't agree with the re-introducing buggy behavior to low-level token methods to account for a bad interaction between the TokenCache and HarFileSystem . The token methods are called in many other places that cannot tolerate an empty string. Allowing an empty string to be returned for invalid input in any other context will result in onerous exceptions like those being fixed, unusable tokens, or silent token renewal failures. The root issue is that TokenCache.obtainTokensForNamenodes calls getCanonicalServiceName to use as the key for the credentials map. getCanonicalServiceName is expected to return the "ip:port" authority. The HarFileSystem has no or an invalid authority, because it has no tokens, so it relied on the low-level token methods to return an empty string on bad input. It should never have gotten that far, so John made it return an empty string immediately instead of relying on bad behavior in the low-level token methods. In effect, har now behaves exactly the same way as it does in 204 w/o relying on bugs. Excuse me for repeating: Allowing an empty string to be returned for invalid input in any other context will result in onerous exceptions like those being fixed, unusable tokens, or silent token renewal failures.
          Hide
          Jitendra Nath Pandey added a comment -

          Comparing the latest patch with the earliest patch that John had, I am more inclined towards the earlier approach. The point is that the methods buildDTService and buildDTAuthority in SecurityUtil are public utility methods but are not handling the invalid input well. In some cases they do return empty string host, but not consistently. I think better approach is to make that behavior consistent. In most common case when these methods are called with an invalid input, the caller code doesn't really care about what they return. For example, calling buildDTService for har file system.

          Show
          Jitendra Nath Pandey added a comment - Comparing the latest patch with the earliest patch that John had, I am more inclined towards the earlier approach. The point is that the methods buildDTService and buildDTAuthority in SecurityUtil are public utility methods but are not handling the invalid input well. In some cases they do return empty string host, but not consistently. I think better approach is to make that behavior consistent. In most common case when these methods are called with an invalid input, the caller code doesn't really care about what they return. For example, calling buildDTService for har file system.
          Hide
          Daryn Sharp added a comment -

          +1 Code change is minimal and looks great esp. since it mimics what TokenCache expected w/o impacting the lower level methods.

          Unless I'm misreading the tests, it would be nice to test the TokenCache with all the uris "har:/path", "har:///path", and "har:///authority/path".

          Show
          Daryn Sharp added a comment - +1 Code change is minimal and looks great esp. since it mimics what TokenCache expected w/o impacting the lower level methods. Unless I'm misreading the tests, it would be nice to test the TokenCache with all the uris "har:/path", "har:///path", and "har:///authority/path".
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12494251/hadoop-7602.4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/170//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12494251/hadoop-7602.4.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/170//console This message is automatically generated.
          Hide
          John George added a comment -

          Uploading a patch that does not conflict with hadoop-7510

          Show
          John George added a comment - Uploading a patch that does not conflict with hadoop-7510
          Hide
          Daryn Sharp added a comment -

          Yes, the patch will be a problem for 7510. As an aside, I believe the null check for addr.getHostName() is a no-op because getHostName should never return null. It will return the unresolvable hostname.

          Reverting SecurityUtil method in the original patch to return an empty string with the authority is null will re-introduce a bug. I don't remember the exact details, but I encountered a case (in hftp?) where a token was being created with an empty service and thus useless.

          More importantly, I don't see how reverting will fix this problem. Reverting will let har:///path work, but I think it's still going to fail on har://hdfs-something/path because the fake hostname will not resolve. addr.getAddress().getHostAddress() will blow with a NPE when getAddress() returns null for the unresolvable host.

          Show
          Daryn Sharp added a comment - Yes, the patch will be a problem for 7510. As an aside, I believe the null check for addr.getHostName() is a no-op because getHostName should never return null. It will return the unresolvable hostname. Reverting SecurityUtil method in the original patch to return an empty string with the authority is null will re-introduce a bug. I don't remember the exact details, but I encountered a case (in hftp?) where a token was being created with an empty service and thus useless. More importantly, I don't see how reverting will fix this problem. Reverting will let har:///path work, but I think it's still going to fail on har://hdfs-something/path because the fake hostname will not resolve. addr.getAddress().getHostAddress() will blow with a NPE when getAddress() returns null for the unresolvable host.
          Hide
          Jitendra Nath Pandey added a comment -

          The change in buildDTAuthority(InetSocketAddress addr) in SecurityUtil:

          - String host= addr.getAddress().getHostAddress();
          + String host = addr.getHostName();
          + if (host == null) {
          + host = addr.getAddress().getHostAddress();
          + }

          This will cause to return hostname instead of ip in usual case. HADOOP-7510 is modifying that behavior, this patch should not change it.

          Show
          Jitendra Nath Pandey added a comment - The change in buildDTAuthority(InetSocketAddress addr) in SecurityUtil: - String host= addr.getAddress().getHostAddress(); + String host = addr.getHostName(); + if (host == null) { + host = addr.getAddress().getHostAddress(); + } This will cause to return hostname instead of ip in usual case. HADOOP-7510 is modifying that behavior, this patch should not change it.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12493893/hadoop-7602.3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/156//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12493893/hadoop-7602.3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/156//console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12493893/hadoop-7602.3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/155//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12493893/hadoop-7602.3.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/155//console This message is automatically generated.
          Hide
          John George added a comment -

          Attaching a patch upgraded to the latest 205 with tests etc...

          Show
          John George added a comment - Attaching a patch upgraded to the latest 205 with tests etc...
          Hide
          John George added a comment -

          Patch based on the latest 205 branch

          Show
          John George added a comment - Patch based on the latest 205 branch
          Hide
          Daryn Sharp added a comment -

          +1 Looks great! Looks like the patch failed due to an issue with build server.

          Show
          Daryn Sharp added a comment - +1 Looks great! Looks like the patch failed due to an issue with build server.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12492856/hadoop-7602.daryns_comment.2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/129//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12492856/hadoop-7602.daryns_comment.2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/129//console This message is automatically generated.
          Hide
          John George added a comment -

          Attaching a new patch with the test changes that Daryn suggested.
          Ran ant test on this patch and no tests that were passing in trunk failed on this patch either.

          Show
          John George added a comment - Attaching a new patch with the test changes that Daryn suggested. Ran ant test on this patch and no tests that were passing in trunk failed on this patch either.
          Hide
          Daryn Sharp added a comment -

          Looks good, sans the small detail that the test won't fail if the IllegalArgumentException isn't thrown. It catches it and considers it a pass, but doesn't fail if it's not thrown. There's also a few commented out lines in the test.

          Show
          Daryn Sharp added a comment - Looks good, sans the small detail that the test won't fail if the IllegalArgumentException isn't thrown. It catches it and considers it a pass, but doesn't fail if it's not thrown. There's also a few commented out lines in the test.
          Hide
          John George added a comment -

          Attaching a patch with changes suggested by Daryn. This way, it will be easier to checkout the changes if anyone else wants to.

          test-patch on the changes:
          [exec]
          [exec] BUILD SUCCESSFUL
          [exec] Total time: 6 minutes 47 seconds
          [exec]
          [exec]
          [exec]
          [exec]
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
          [exec]
          [exec]
          [exec]
          [exec]
          [exec] ======================================================================
          [exec] ======================================================================
          [exec] Finished build.
          [exec] ======================================================================
          [exec] ======================================================================
          [exec]
          [exec]

          BUILD SUCCESSFUL

          Show
          John George added a comment - Attaching a patch with changes suggested by Daryn. This way, it will be easier to checkout the changes if anyone else wants to. test-patch on the changes: [exec] [exec] BUILD SUCCESSFUL [exec] Total time: 6 minutes 47 seconds [exec] [exec] [exec] [exec] [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] [exec] [exec] [exec] ====================================================================== [exec] ====================================================================== [exec] Finished build. [exec] ====================================================================== [exec] ====================================================================== [exec] [exec] BUILD SUCCESSFUL
          Hide
          John George added a comment -

          >
          > If buildDTServiceAuthority ever returns an empty string, then the token is
          > rendered useless. The logs will show that a token was acquired, but
          > connections will mysteriously fail because an empty string doesn't match the
          > address of the connection.

          The logs are buggy then?

          >
          > The 1-line fix for the real problem is equally as minimal as reintroducing a
          > bug to mask another bug.
          >
          What you are saying here "might" be true, but my point is that in a sustaining release, it is better to minimize changes to known issues (as opposed to fixing everything around it) - just because it is safer that way.

          Having said that, I will still upload another patch with the changes you suggested

          Show
          John George added a comment - > > If buildDTServiceAuthority ever returns an empty string, then the token is > rendered useless. The logs will show that a token was acquired, but > connections will mysteriously fail because an empty string doesn't match the > address of the connection. The logs are buggy then? > > The 1-line fix for the real problem is equally as minimal as reintroducing a > bug to mask another bug. > What you are saying here "might" be true, but my point is that in a sustaining release, it is better to minimize changes to known issues (as opposed to fixing everything around it) - just because it is safer that way. Having said that, I will still upload another patch with the changes you suggested
          Hide
          Daryn Sharp added a comment -

          The method build(D)elegation(T)okenServiceAuthority is specifically meant for use by tokens. The client uses buildDTServiceAuthority to set the service in a token. The service is what the RPC layer matches against to find a token for new connections.

          If buildDTServiceAuthority ever returns an empty string, then the token is rendered useless. The logs will show that a token was acquired, but connections will mysteriously fail because an empty string doesn't match the address of the connection.

          The 1-line fix for the real problem is equally as minimal as reintroducing a bug to mask another bug.

          Show
          Daryn Sharp added a comment - The method build(D)elegation(T)okenServiceAuthority is specifically meant for use by tokens. The client uses buildDTServiceAuthority to set the service in a token. The service is what the RPC layer matches against to find a token for new connections. If buildDTServiceAuthority ever returns an empty string, then the token is rendered useless. The logs will show that a token was acquired, but connections will mysteriously fail because an empty string doesn't match the address of the connection. The 1-line fix for the real problem is equally as minimal as reintroducing a bug to mask another bug.
          Hide
          John George added a comment -

          >
          > Regarding tests, no matter how this is fixed, you should have one that calls
          > TokenCache.obtainTokensForNamenodes with a non-server based authority uri
          > (ie. a har path) since that's what is breaking.
          To me (based on the two patches uploaded until now), this is a unit test that specifically tests whether buildDTServiceName() is dealing with non-server based authority uri properly or not - because that is specifically where the NPE was. To test obtainTokensForNamenodes() or any function above would be good to have, but does not look like it is specific to the patch that is submitted herewith. But, if the changes you suggested are made, then it does makes sense to unit test ObjectTokensForNamenodes as well.

          >
          > I consider the pre MR-2780 behavior of buildDTServiceAuthority is a bug.
          > The method is not documented as returning an empty string when given bad
          > input.
          Are you saying everything that is not documented is a bug? On a serious note, when you say the "pre MR-2780" behavior is a bug - are you saying there were issues that the users of buildDTServiceAuthority ran into because it was returning a string that was not "host:port"?
          Or are you saying it could have had issues? If so, what issues?

          >
          > The TokenCache should stop erroneously calling it. However, even if the
          > TokenCache isn't going to erroneously call the method anymore, I feel
          > there is still value in having buildDTServiceAuthority throw on invalid
          > input instead of returning an invalid value. Otherwise, an invalid return
          > value will only cause an error to crop up someplace else.
          What you are stating above definitely makes sense in trunk - but in a sustaining release it seems better to make minimal changes to fix bugs?

          Show
          John George added a comment - > > Regarding tests, no matter how this is fixed, you should have one that calls > TokenCache.obtainTokensForNamenodes with a non-server based authority uri > (ie. a har path) since that's what is breaking. To me (based on the two patches uploaded until now), this is a unit test that specifically tests whether buildDTServiceName() is dealing with non-server based authority uri properly or not - because that is specifically where the NPE was. To test obtainTokensForNamenodes() or any function above would be good to have, but does not look like it is specific to the patch that is submitted herewith. But, if the changes you suggested are made, then it does makes sense to unit test ObjectTokensForNamenodes as well. > > I consider the pre MR-2780 behavior of buildDTServiceAuthority is a bug. > The method is not documented as returning an empty string when given bad > input. Are you saying everything that is not documented is a bug? On a serious note, when you say the "pre MR-2780" behavior is a bug - are you saying there were issues that the users of buildDTServiceAuthority ran into because it was returning a string that was not "host:port"? Or are you saying it could have had issues? If so, what issues? > > The TokenCache should stop erroneously calling it. However, even if the > TokenCache isn't going to erroneously call the method anymore, I feel > there is still value in having buildDTServiceAuthority throw on invalid > input instead of returning an invalid value. Otherwise, an invalid return > value will only cause an error to crop up someplace else. What you are stating above definitely makes sense in trunk - but in a sustaining release it seems better to make minimal changes to fix bugs?
          Hide
          Daryn Sharp added a comment -

          Regarding tests, no matter how this is fixed, you should have one that calls TokenCache.obtainTokensForNamenodes with a non-server based authority uri (ie. a har path) since that's what is breaking.

          I consider the pre MR-2780 behavior of buildDTServiceAuthority is a bug. The method is not documented as returning an empty string when given bad input.

          The TokenCache should stop erroneously calling it. However, even if the TokenCache isn't going to erroneously call the method anymore, I feel there is still value in having buildDTServiceAuthority throw on invalid input instead of returning an invalid value. Otherwise, an invalid return value will only cause an error to crop up someplace else.

          Show
          Daryn Sharp added a comment - Regarding tests, no matter how this is fixed, you should have one that calls TokenCache.obtainTokensForNamenodes with a non-server based authority uri (ie. a har path) since that's what is breaking. I consider the pre MR-2780 behavior of buildDTServiceAuthority is a bug. The method is not documented as returning an empty string when given bad input. The TokenCache should stop erroneously calling it. However, even if the TokenCache isn't going to erroneously call the method anymore, I feel there is still value in having buildDTServiceAuthority throw on invalid input instead of returning an invalid value. Otherwise, an invalid return value will only cause an error to crop up someplace else.
          Hide
          John George added a comment -

          Attaching a patch with unit test (but without the changes Daryn suggested). This patch was created before I read Daryn's comment.

          [exec]
          [exec] BUILD SUCCESSFUL
          [exec] Total time: 6 minutes 23 seconds
          [exec]
          [exec]
          [exec]
          [exec]
          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
          [exec]
          [exec]
          [exec]
          [exec]
          [exec] ======================================================================
          [exec] ======================================================================
          [exec] Finished build.
          [exec] ======================================================================
          [exec] ======================================================================
          [exec]

          Show
          John George added a comment - Attaching a patch with unit test (but without the changes Daryn suggested). This patch was created before I read Daryn's comment. [exec] [exec] BUILD SUCCESSFUL [exec] Total time: 6 minutes 23 seconds [exec] [exec] [exec] [exec] [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. [exec] [exec] [exec] [exec] [exec] ====================================================================== [exec] ====================================================================== [exec] Finished build. [exec] ====================================================================== [exec] ====================================================================== [exec]
          Hide
          John George added a comment -

          > Having buildDTServiceName return an empty string to satisfy bad behavior
          > in the TokenCache doesn't feel right. I'd prefer an exception like {{new
          > IllegalArgumentException( uri + "does not have a server-based authority)}} to
          > guard against accidental invocation in the future.
          >
          I am not knowledgeable enough to decide either way, but the reason that the above change were made in buildDTServiceName() is because it was doing something very similar before the changes in MR-2780. If the need to check is soon going to go away, would it be better to continue what it did before MR-2780?

          Show
          John George added a comment - > Having buildDTServiceName return an empty string to satisfy bad behavior > in the TokenCache doesn't feel right. I'd prefer an exception like {{new > IllegalArgumentException( uri + "does not have a server-based authority)}} to > guard against accidental invocation in the future. > I am not knowledgeable enough to decide either way, but the reason that the above change were made in buildDTServiceName() is because it was doing something very similar before the changes in MR-2780. If the need to check is soon going to go away, would it be better to continue what it did before MR-2780?
          Hide
          Daryn Sharp added a comment -

          I was actually working on this same problem... The issue is TokenCache is attempting to get tokens for a non-server based authority. buildDTServiceName is documented as returning ip:port. It's only called by a FileSystem to set its token's service.

          Having buildDTServiceName return an empty string to satisfy bad behavior in the TokenCache doesn't feel right. I'd prefer an exception like new IllegalArgumentException( uri + "does not have a server-based authority) to guard against accidental invocation in the future.

          Then back in TokenCache.obtainTokensForNamenodesInternal, add:

          if (fs.getUri().getAuthority() == null) continue;
          

          Otherwise the empty string would cause it to go through the machinations of trying to get a token.

          (Note: the need for this check should soon go away because the fixing of token renewal causes the TokenCache to only request tokens for a FileSystem that implements tokens)

          Show
          Daryn Sharp added a comment - I was actually working on this same problem... The issue is TokenCache is attempting to get tokens for a non-server based authority. buildDTServiceName is documented as returning ip:port. It's only called by a FileSystem to set its token's service. Having buildDTServiceName return an empty string to satisfy bad behavior in the TokenCache doesn't feel right. I'd prefer an exception like new IllegalArgumentException( uri + "does not have a server-based authority) to guard against accidental invocation in the future. Then back in TokenCache.obtainTokensForNamenodesInternal , add: if (fs.getUri().getAuthority() == null ) continue ; Otherwise the empty string would cause it to go through the machinations of trying to get a token. (Note: the need for this check should soon go away because the fixing of token renewal causes the TokenCache to only request tokens for a FileSystem that implements tokens)
          Hide
          Mahadev konar added a comment -

          John,
          I think we should get this into 205? Can you please mark the fix version as 205?

          Show
          Mahadev konar added a comment - John, I think we should get this into 205? Can you please mark the fix version as 205?
          Hide
          John George added a comment -

          Attaching a patch for preliminary review. Need to write tests and ensure nothing else is broken by running other tests...

          URI for har files look like this: har:/tmp/archive_1.har. uri.getAuthority() in this case returns null, which get passed to createSocketAddr() by buildDTServiceName(). Before MAPREDUCE-2780, buildDTServiceName() used to check for null values and replace with empty string if necessary. This check needs to continue happening.

          Show
          John George added a comment - Attaching a patch for preliminary review. Need to write tests and ensure nothing else is broken by running other tests... URI for har files look like this: har:/tmp/archive_1.har. uri.getAuthority() in this case returns null, which get passed to createSocketAddr() by buildDTServiceName(). Before MAPREDUCE-2780 , buildDTServiceName() used to check for null values and replace with empty string if necessary. This check needs to continue happening.
          Hide
          John George added a comment -

          The NPE is as follows:

          11/08/13 03:51:42 INFO mapred.JobClient: Cleaning up the staging area
          hdfs://<hostname/user_name>/.staging/job_<id>
          java.lang.NullPointerException
          at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:137)
          at org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:261)
          at org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:161)
          at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:92)
          at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:79)
          at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:163)
          at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
          at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:211)
          at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:946)
          at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:938)
          at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
          at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:854)
          at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:807)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:396)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
          at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:807)
          at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:781)
          at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1218)
          at org.apache.hadoop.examples.Sort.run(Sort.java:176)
          at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
          at org.apache.hadoop.examples.Sort.main(Sort.java:187)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
          at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
          at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

          Show
          John George added a comment - The NPE is as follows: 11/08/13 03:51:42 INFO mapred.JobClient: Cleaning up the staging area hdfs://<hostname/user_name>/.staging/job_<id> java.lang.NullPointerException at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:137) at org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:261) at org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:161) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:92) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:79) at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:163) at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44) at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:211) at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:946) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:938) at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:854) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:807) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:807) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:781) at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1218) at org.apache.hadoop.examples.Sort.run(Sort.java:176) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.examples.Sort.main(Sort.java:187) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68) at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139) at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

            People

            • Assignee:
              John George
              Reporter:
              John George
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development