Hadoop HDFS
HDFS-6475

WebHdfs clients fail without retry because incorrect handling of StandbyException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.5.0
    • Component/s: ha, webhdfs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      With WebHdfs clients connected to an HA HDFS service, the delegation token is initialized beforehand with the active NN.

      When a client issues a request, the NNs it may contact are stored in a map returned by DFSUtil.getNNServiceRpcAddresses(conf). The client contacts the NNs in that order, so the first one it reaches is likely the standby NN. If the standby NN doesn't have the updated client credential, it throws a SecurityException that wraps a StandbyException.

      The client is expected to retry against another NN, but due to the insufficient handling of the SecurityException mentioned above, it fails instead.

      Example message:

      {RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
      
      org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
              at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
              at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
              at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
              at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
              at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
              at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
              at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
              at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
              at kclient1.kclient$1.run(kclient.java:64)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:356)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
              at kclient1.kclient.main(kclient.java:58)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
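      The intended failover behavior is that a retriable StandbyException lets the client move on to the next NN; the bug is that the exception arrives wrapped, so the retry logic never recognizes it. A minimal, self-contained sketch of that intended loop (stand-in types, not the actual WebHdfsFileSystem code):

```java
import java.io.IOException;
import java.util.List;

// Illustrative sketch only: a client that tries each NameNode address in
// order and fails over on StandbyException. All types are stand-ins.
public class FailoverSketch {
    static class StandbyException extends IOException {}

    interface NameNode {
        String getFileStatus(String path) throws IOException;
    }

    static String getWithFailover(List<NameNode> nns, String path)
            throws IOException {
        IOException last = null;
        for (NameNode nn : nns) {          // order comes from configuration,
            try {                          // so the standby may be tried first
                return nn.getFileStatus(path);
            } catch (StandbyException e) { // retriable: try the next NN
                last = e;
            }
        }
        throw (last != null) ? last : new IOException("no NameNodes configured");
    }

    public static void main(String[] args) throws IOException {
        NameNode standby = path -> { throw new StandbyException(); };
        NameNode active = path -> "status:" + path;
        // Standby is skipped, active answers.
        System.out.println(getWithFailover(List.of(standby, active), "/tmp/f"));
    }
}
```

      In the actual client the address list comes from DFSUtil.getNNServiceRpcAddresses(conf); here it is just a Java list.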
      
      1. HDFS-6475.009.patch
        8 kB
        Yongjun Zhang
      2. HDFS-6475.008.patch
        8 kB
        Yongjun Zhang
      3. HDFS-6475.007.patch
        10 kB
        Yongjun Zhang
      4. HDFS-6475.006.patch
        13 kB
        Yongjun Zhang
      5. HDFS-6475.005.patch
        12 kB
        Yongjun Zhang
      6. HDFS-6475.004.patch
        11 kB
        Yongjun Zhang
      7. HDFS-6475.003.patch
        11 kB
        Yongjun Zhang
      8. HDFS-6475.003.patch
        11 kB
        Yongjun Zhang
      9. HDFS-6475.002.patch
        5 kB
        Yongjun Zhang
      10. HDFS-6475.001.patch
        2 kB
        Yongjun Zhang

        Activity

        Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/)
        HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)

        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java
        Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/)
        HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)

        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java
        Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/594/)
        HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)

        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java
        Yongjun Zhang added a comment -

        Thanks a lot Aaron T. Myers!

        Many thanks to Daryn Sharp and Jing Zhao for the review and comments. I will follow up with the getTrueCause issue in HDFS-6588.

        Hudson added a comment -

        SUCCESS: Integrated in Hadoop-trunk-Commit #5774 (See https://builds.apache.org/job/Hadoop-trunk-Commit/5774/)
        HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217)

        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java
        • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java
        Aaron T. Myers added a comment -

        I've just committed this to trunk and branch-2.

        Thanks a lot for the contribution, Yongjun, and thanks a lot to Daryn and Jing for the reviews.

        Aaron T. Myers added a comment -

        +1, the latest patch looks good to me. I'm going to commit this momentarily.

        We can do the follow-up to clean up the code as Daryn suggests in the other JIRA that Yongjun has filed.

        Aaron T. Myers added a comment -

        Hey Daryn, given Yongjun's investigation, seems like this is a bit more complex than originally suggested. Unless you object, I'm going to go ahead and check this in tomorrow so we can get this fixed and move on.

        Yongjun Zhang added a comment -

        Hi Daryn Sharp,

        I'd like to check with you; you mentioned:

        If it turns out to be a lot more complicated, then perhaps a followup jira is ok

        Based on the information we have so far, the work involved is to remove the getTrueCause method, replace retrievePassword with retriableRetrievePassword, change the interface spec of the relevant methods (because retriableRetrievePassword throws more exceptions), and fix the test failures reported in HDFS-6588.

        I hope you'd agree that it's appropriate to dedicate HDFS-6588 to the above-mentioned work, and to use the latest patch I posted for HDFS-6475 to handle the SecurityException that UserProvider throws. Would you please comment again? Thanks.

        BTW, thanks Jing Zhao for clarifying things in HDFS-6588 (sorry, I had a typo in my last update, writing 6589).

        Yongjun Zhang added a comment -

        Hi Daryn Sharp,

        Thanks a lot for the comments. The call to DelegationTokenSecretManager#retrievePassword is the sole place I have seen.

        The following method in AbstractDelegationTokenSecretManager is where retrievePassword is called:

          public synchronized void verifyToken(TokenIdent identifier, byte[] password)
              throws InvalidToken {
            byte[] storedPassword = retrievePassword(identifier);
            if (!Arrays.equals(password, storedPassword)) {
              throw new InvalidToken("token (" + identifier
                  + ") is invalid, password doesn't match");
            }
          }
        

        I wonder whether we can just replace the above retrievePassword call with retriableRetrievePassword here. I will give it a try.
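        A hedged sketch of what that substitution might look like (all types here are simplified stand-ins, not the actual Hadoop classes, and retriableRetrievePassword's real signature may differ):

```java
import java.util.Arrays;

// Sketch of the proposed change: verifyToken delegates to the retriable
// variant so a StandbyException propagates instead of being wrapped.
public class VerifyTokenSketch {
    static class StandbyException extends Exception {}
    static class InvalidToken extends Exception {
        InvalidToken(String msg) { super(msg); }
    }

    private final byte[] stored;
    private final boolean standby;

    VerifyTokenSketch(byte[] stored, boolean standby) {
        this.stored = stored;
        this.standby = standby;
    }

    /** Retriable variant: throws StandbyException directly instead of wrapping it. */
    byte[] retriableRetrievePassword(String identifier)
            throws InvalidToken, StandbyException {
        if (standby) {
            throw new StandbyException(); // surfaces as retriable to the caller
        }
        return stored;
    }

    /** verifyToken rewritten to call the retriable variant. */
    synchronized void verifyToken(String identifier, byte[] password)
            throws InvalidToken, StandbyException {
        byte[] storedPassword = retriableRetrievePassword(identifier);
        if (!Arrays.equals(password, storedPassword)) {
            throw new InvalidToken("token (" + identifier
                + ") is invalid, password doesn't match");
        }
    }

    public static void main(String[] args) throws Exception {
        VerifyTokenSketch active = new VerifyTokenSketch(new byte[]{1, 2}, false);
        active.verifyToken("token-1", new byte[]{1, 2});
        System.out.println("active NN: token verified");
        try {
            new VerifyTokenSketch(new byte[]{1, 2}, true)
                .verifyToken("token-1", new byte[]{1, 2});
        } catch (StandbyException e) {
            System.out.println("standby NN: StandbyException propagated");
        }
    }
}
```

        The point of the change is the throws clause: once StandbyException is declared rather than swallowed, callers up the stack can pass it through to the retry logic.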

        The failed tests are reported in HDFS-6589, related to HDFS-5322. Hi Jing Zhao, I put a question in HDFS-6589. I wonder if the failed tests are designed to cover real user scenarios? Thanks for clarifying.

        Best regards.

        Daryn Sharp added a comment -

        Your earlier suggestion indicated that we should use SecretManager#retriableRetrievePassword instead of SecretManager#retrievePassword, does that mean client code has to be modified?

        If I understand the question correctly: the methods are only used server-side, so no client-side changes should be required, and hence there are no incompatibility concerns.

        Did you happen to trace how/where the StandbyException is wrapped in an InvalidToken? It looks like DelegationTokenSecretManager#retrievePassword is the only place it occurs, but DelegationTokenSecretManager#retriableRetrievePassword does not wrap exceptions in InvalidToken.

        Is this maybe just a test case issue? Which testcase is failing?

        Yongjun Zhang added a comment -

        Filed HDFS-6589 for the flaky test TestDistributedFileSystem.testAllWithNoXmlDefaults.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12651893/HDFS-6475.009.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.TestDistributedFileSystem

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7208//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7208//console

        This message is automatically generated.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12651890/HDFS-6475.008.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7207//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7207//console

        This message is automatically generated.

        Yongjun Zhang added a comment -

        Uploaded 009 to share one line of code.

        Yongjun Zhang added a comment -

        Hi Daryn Sharp, Jing Zhao and Aaron T. Myers,

        Many thanks to you all for the earlier review and comments. I just uploaded a new revision (008) to address the comments and the testing errors.
        In summary,

        • Per Daryn's suggestion, I attempted to remove the getTrueCause() method from Server.java altogether, but ran into test failures. After spending quite some time looking into them, I believe that removing the existing getTrueCause method really deserves a new JIRA, so I filed HDFS-6588 with details and questions. I hope you'd agree based on the information I provided there, but I'm certainly open to further discussion.
        • The new patch I just uploaded for HDFS-6475 is limited to handling the case reported in this JIRA. It is only a few lines in ExceptionHandler.java, plus the test case I added. It no longer calls the getTrueCause() method defined in Server.java.

        Thanks a lot for following up.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12651811/HDFS-6475.007.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.ipc.TestSaslRPC
        org.apache.hadoop.hdfs.web.TestWebHdfsTokens

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7198//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7198//console

        This message is automatically generated.

        Yongjun Zhang added a comment -

        Hi,

        I uploaded a new patch per Daryn Sharp's suggestion. On top of what Daryn suggested, and because of the exceptions I described in my previous update (with stack information), I added the following logic for InvalidToken in ExceptionHandler:

            if (e instanceof SecurityException) {
              e = toCause(e);
            }
            if (e instanceof InvalidToken) {
              e = toCause(e);
            }
        

        This logic is essentially what I originally wanted to reuse the getTrueCause method for.
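        As a self-contained illustration of the two-step unwrap (stand-in exception classes and a simplified toCause, not the actual Hadoop code):

```java
// Minimal sketch: peel SecurityException, then InvalidToken, to surface
// the retriable cause underneath. All types here are stand-ins.
public class UnwrapDemo {
    static class StandbyException extends Exception {}
    static class InvalidToken extends Exception {
        InvalidToken(String msg, Throwable cause) { super(msg, cause); }
    }

    // Simplified stand-in for the toCause helper: return the wrapped
    // cause if there is one, otherwise the exception itself.
    static Throwable toCause(Throwable e) {
        Throwable cause = e.getCause();
        return (cause != null) ? cause : e;
    }

    public static void main(String[] args) {
        // The failure mode from this JIRA: SecurityException wrapping
        // an InvalidToken wrapping a StandbyException.
        Throwable e = new SecurityException(
            new InvalidToken("StandbyException", new StandbyException()));
        if (e instanceof SecurityException) {
            e = toCause(e);
        }
        if (e instanceof InvalidToken) {
            e = toCause(e);
        }
        System.out.println(e.getClass().getSimpleName()); // StandbyException
    }
}
```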

        Hi Daryn, would you please help review again?
        BTW, refer to your comment:

        In saslProcess, just throw the exception instead of running it through getTrueCause since it's not a "InvalidToken wrapping another exception" anymore.

        I did what you suggested, but I'm still getting an InvalidToken exception (see the stack described above). So it seems that the exception saslProcess tries to handle comes from a different source than the one I'm running into.

        Thanks a lot.

        Yongjun Zhang added a comment -

        Hello Guys,

        Thank you all for the review and discussion.

        As a follow-up, the first thing I did was to check that retriableRetrievePassword does throw StandbyException.

        Then I tried what Daryn Sharp suggested as the way to fix this JIRA, and still ran into the same symptom on the client side:

        org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
        	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:348)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:108)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:580)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:546)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
        	at kclient1.kclient1$2.run(kclient1.java:80)
        	at java.security.AccessController.doPrivileged(Native Method)
        	at javax.security.auth.Subject.doAs(Subject.java:356)
        	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
        	at kclient1.kclient1.main(kclient1.java:74)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        	at java.lang.reflect.Method.invoke(Method.java:606)
        	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
        Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): StandbyException
        	at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:108)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
        	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
        

        I traced the server side and found the following:

        UserProvider.getValue caught org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, with the following stack:

        UserProvider.getValue(HttpContext) line: 56
        UserProvider.getValue(HttpContext) line: 41
        InjectableValuesProvider.getInjectableValues(HttpContext) line: 46
        AbstractResourceMethodDispatchProvider$ResponseOutInvoker(AbstractResourceMethodDispatchProvider$EntityParamInInvoker).getParams(HttpContext) line: 153
        AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(Object, HttpContext) line: 203
        AbstractResourceMethodDispatchProvider$ResponseOutInvoker(ResourceJavaMethodDispatcher).dispatch(Object, HttpContext) line: 75
        HttpMethodRule.accept(CharSequence, Object, UriRuleContext) line: 288
        ResourceClassRule.accept(CharSequence, Object, UriRuleContext) line: 108
        RightHandPathRule.accept(CharSequence, Object, UriRuleContext) line: 147
        RootResourceClassesRule.accept(CharSequence, Object, UriRuleContext) line: 84
        WebApplicationImpl._handleRequest(WebApplicationContext, ContainerRequest) line: 1469
        WebApplicationImpl._handleRequest(WebApplicationContext, ContainerRequest, ContainerResponse) line: 1400
        WebApplicationImpl.handleRequest(ContainerRequest, ContainerResponse) line: 1349
        WebApplicationImpl.handleRequest(ContainerRequest, ContainerResponseWriter) line: 1339
        ServletContainer$InternalWebComponent(WebComponent).service(URI, URI, HttpServletRequest, HttpServletResponse) line: 416
        ServletContainer.service(URI, URI, HttpServletRequest, HttpServletResponse) line: 537
        ServletContainer.service(HttpServletRequest, HttpServletResponse) line: 699
        ServletContainer(HttpServlet).service(ServletRequest, ServletResponse) line: 820
        ServletHolder.handle(ServletRequest, ServletResponse) line: 511
        ServletHandler$CachedChain.doFilter(ServletRequest, ServletResponse) line: 1221
        AuthFilter.doFilter(ServletRequest, ServletResponse, FilterChain) line: 82
        ServletHandler$CachedChain.doFilter(ServletRequest, ServletResponse) line: 1212
        HttpServer2$QuotingInputFilter.doFilter(ServletRequest, ServletResponse, FilterChain) line: 1183
        ServletHandler$CachedChain.doFilter(ServletRequest, ServletResponse) line: 1212
        NoCacheFilter.doFilter(ServletRequest, ServletResponse, FilterChain) line: 45
        ServletHandler$CachedChain.doFilter(ServletRequest, ServletResponse) line: 1212
        NoCacheFilter.doFilter(ServletRequest, ServletResponse, FilterChain) line: 45
        ServletHandler$CachedChain.doFilter(ServletRequest, ServletResponse) line: 1212
        ServletHandler.handle(String, HttpServletRequest, HttpServletResponse, int) line: 399
        SecurityHandler.handle(String, HttpServletRequest, HttpServletResponse, int) line: 216
        SessionHandler.handle(String, HttpServletRequest, HttpServletResponse, int) line: 182
        WebAppContext(ContextHandler).handle(String, HttpServletRequest, HttpServletResponse, int) line: 766
        WebAppContext.handle(String, HttpServletRequest, HttpServletResponse, int) line: 450
        ContextHandlerCollection.handle(String, HttpServletRequest, HttpServletResponse, int) line: 230
        Server(HandlerWrapper).handle(String, HttpServletRequest, HttpServletResponse, int) line: 152
        Server.handle(HttpConnection) line: 326
        HttpConnection.handleRequest() line: 542
        HttpConnection$RequestHandler.headerComplete() line: 928
        

        Upon catching this exception, UserProvider wraps it in a SecurityException and throws it.

        Then at ExceptionHandler, the first exception caught is a ContainerException:
        com.sun.jersey.api.container.ContainerException: Exception obtaining parameters
        which in turn has cause
        java.lang.SecurityException: Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException

        With the current code in ExceptionHandler, the ContainerException's cause is first extracted; then, with the three-line change I added for SecurityException, the InvalidToken exception is extracted. It stops there, so the InvalidToken with cause StandbyException is sent back to the client, causing the client-side exception described at the beginning of this comment.

        That's what explains the client-side exception. I could add more logic in ExceptionHandler along these lines:

           if the exception is InvalidToken,
             extract its cause

        Since we could end up adding quite a few checks like this, I guess it may be fine. I will give it a try.
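        The unwrapping discussed here can be sketched as a small, self-contained Java program. This is only an illustration of the idea, not the Hadoop code: InvalidToken and StandbyException below are local stand-ins for the Hadoop classes of the same names, and unwrap is a hypothetical helper.

```java
// Sketch of walking an exception's cause chain to surface the real failure.
// InvalidToken and StandbyException are local stand-ins, not the Hadoop classes.
public class UnwrapSketch {
    public static class InvalidToken extends Exception {
        public InvalidToken(String msg, Throwable cause) { super(msg, cause); }
    }
    public static class StandbyException extends Exception {
        public StandbyException(String msg) { super(msg); }
    }

    // Keep unwrapping while the current throwable is a wrapper type
    // (SecurityException or InvalidToken) and it has a cause to expose.
    public static Throwable unwrap(Throwable t) {
        while ((t instanceof SecurityException || t instanceof InvalidToken)
                && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    public static void main(String[] args) {
        // Mirrors the nesting seen on the server: SecurityException ->
        // InvalidToken -> StandbyException.
        Throwable nested = new SecurityException(
                "Failed to obtain user group information",
                new InvalidToken("InvalidToken", new StandbyException("StandbyException")));
        System.out.println(unwrap(nested).getClass().getSimpleName()); // prints "StandbyException"
    }
}
```

        With a loop like this, handling another wrapper type is one more clause in the condition rather than another ad-hoc check in the handler.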
        Please let me know if you know of a better option than this.

        Thanks a lot.

        Hide
        Yongjun Zhang added a comment -

        Hi Daryn Sharp,

        While I'm working on a revised version, I have a question. Your earlier suggestion indicated that we should use SecretManager#retriableRetrievePassword instead of SecretManager#retrievePassword; does that mean client code has to be modified? If so, is it a compatibility issue? Thanks a lot.

        Hide
        Yongjun Zhang added a comment -

        Many thanks Daryn Sharp, I will work on a new revision per your suggestion.

        Hide
        Daryn Sharp added a comment -

        I'm +0 and defer to atm if the fix is harder than what I just suggested.

        Hide
        Daryn Sharp added a comment -

        What I'm saying is I think the patch adds too much unnecessary code. Filing an improvement to delete all but a few lines of the code changed in this patch seems a bit odd. I think you just need to:

        1. Delete getTrueCause entirely instead of moving it elsewhere
        2. In saslProcess, just throw the exception instead of running it through getTrueCause, since it's no longer an "InvalidToken wrapping another exception"
        3. Keep your 3-line change to unwrap SecurityException in toResponse

        If it turns out to be a lot more complicated, then perhaps a followup jira is ok, but since what should be the cleanest fix is 4 lines of change it would be nice to try that first.
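        For concreteness, the SecurityException unwrapping in step 3 might look roughly like the following self-contained sketch. ExceptionHandlerSketch and toResponse here are illustrative stand-ins, not the actual Hadoop patch:

```java
// Hypothetical sketch of step 3: peel one SecurityException layer off in the
// response path so the real cause is what gets reported to the client.
public class ExceptionHandlerSketch {
    public static Exception toResponse(Exception e) {
        // Unwrap SecurityException to expose its underlying cause.
        if (e instanceof SecurityException && e.getCause() instanceof Exception) {
            e = (Exception) e.getCause();
        }
        return e;
    }

    public static void main(String[] args) {
        Exception wrapped = new SecurityException(
                "Failed to obtain user group information",
                new IllegalStateException("StandbyException"));
        System.out.println(toResponse(wrapped).getMessage()); // prints "StandbyException"
    }
}
```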

        Hide
        Yongjun Zhang added a comment -

        Hi Daryn Sharp,

        Thanks a lot for the review and the in-depth comments. Since the getTrueCause method is already there, the patch I posted reused it to solve the problem. What you suggested would be an improvement on that solution; would you please consider committing this patch first and filing a follow-up jira to improve the solution? Really appreciate your help.

        Hide
        Daryn Sharp added a comment -

        I don't think getTrueCause is necessary anymore and can be removed. It was born when SaslDigestCallbackHandler used to call SecretManager#retrievePassword which could only throw InvalidToken. A standby/retriable/etc exception had to be tunneled out via an InvalidToken cause. Now there's apparently a SecretManager#retriableRetrievePassword that can throw those exceptions directly.

        The problem comes down to an abuse of the injectable UserProvider to do authentication instead of via an authentication filter... Perhaps SecurityException is the only exception for which the cause needs to be extracted.

        Hide
        Daryn Sharp added a comment -

        Taking a look now.

        Hide
        Aaron T. Myers added a comment -

        +1, the latest patch looks good to me.

        I'll commit this tomorrow unless anyone else has comments in the meantime.

        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12651015/HDFS-6475.006.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7160//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7160//console

        This message is automatically generated.

        Hide
        Yongjun Zhang added a comment -

        Thanks Daryn Sharp. I didn't find a good existing file for that, so I introduced a new one called RpcServerUtil.java. Hope it looks fine to you guys. Uploading version 006.

        Hide
        Daryn Sharp added a comment -

        Is a static method in InvalidToken really the right place to check and handle other non-InvalidToken exceptions?

        Hide
        Jing Zhao added a comment -

        Yeah, +1

        Hide
        Aaron T. Myers added a comment -

        Latest patch looks pretty good to me. Jing Zhao?

        Hide
        Yongjun Zhang added a comment -

        BTW, the testLargeFile failure is intermittent; I updated HDFS-6543 about it.

        Hide
        Yongjun Zhang added a comment -

        Running the two failed tests locally, I see that one passed and the other failed on today's trunk without my change. Filed HDFS-6543.

        Specifically:
        org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer.testEncryptedBalancer2 passed, and
        org.apache.hadoop.hdfs.web.TestWebHDFS.testLargeFile failed both with and without the fix; I filed HDFS-6543.

        Thanks.

        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12650507/HDFS-6475.005.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.web.TestWebHDFS
        org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7126//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7126//console

        This message is automatically generated.

        Hide
        Yongjun Zhang added a comment -

        Hi Jing Zhao,

        Thanks a lot for the review and the suggestion! I agree InvalidToken is indeed a good place to hold this util method. I made this change and uploaded patch 005.

        Hide
        Jing Zhao added a comment -

        Thanks for the fix, Yongjun Zhang! The patch looks good to me. My only concern is whether we can put the getTrueCause method into some Util class, so that we do not need to call a method defined in ipc.Server from a class used only by webhdfs. I guess we can even move it into the InvalidToken class?

        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12650438/HDFS-6475.004.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7121//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7121//console

        This message is automatically generated.

        Hide
        Yongjun Zhang added a comment -

        Uploaded patch version 004.

        Hide
        Yongjun Zhang added a comment -

        Hi ATM,

        Thanks a lot for the review! Sorry I didn't make it clear earlier. The change in ExceptionHandler does happen on the server side. Basically the ExceptionHandler class processes the original exception thrown on the server side and passes a possibly revised exception to the client. The original server-side exception is a SecurityException (from the UserProvider class) whose cause is an InvalidToken, which in turn has cause StandbyException; the ExceptionHandler processes it and passes the StandbyException to the client.

        I'm uploading a revised version to address both of your comments. Hopefully the revised comments make this clearer. Thanks in advance for reviewing the new revision!

        Hide
        Aaron T. Myers added a comment -

        The latest patch looks pretty good to me. Two very small comments:

        1. I think that the method comment for testDelegationTokenStandbyNNAppearFirst is a bit misleading. Seems like it's implying that the Standby NN is now throwing a different exception, when in fact I believe that the exception that's thrown is not changed by this patch, but rather that the client-side handling of the unwrapping of the exception is changed.
        2. There should be no need to restore the state of the standby/active NNs at the end of the test, since the cluster is always shut down at the end of every test in this class.

        +1 from me once the above are addressed.

        Daryn Sharp and Jing Zhao - does the latest patch look OK to you?

        Hide
        Yongjun Zhang added a comment -

        Filed HDFS-6518 (TestCacheDirectives#testExceedsCapacity fails intermittently).

        Hide
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12649805/HDFS-6475.003.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7082//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7082//console

        This message is automatically generated.

        Yongjun Zhang added a comment -

        The failed test TestCacheDirectives.testExceedsCapacity seems unrelated, and running it locally succeeded; uploading the same patch to try again.

        I saw a similar jira HDFS-6257 for this testcase, fixed by Colin Patrick McCabe, but the symptom is different. Hi Colin, would you please take a quick look to see if this is indeed an issue (such as a race condition)? I can file a jira if so.

        Thanks.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12649756/HDFS-6475.003.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7080//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7080//console

        This message is automatically generated.

        Yongjun Zhang added a comment -

        Hi Guys,

        I managed to create a testcase and uploaded patch version 003. Would you please help review it again?

        Hi Daryn Sharp, I found that ExceptionHandler already did the following import from ipc area:

         import org.apache.hadoop.ipc.RemoteException;
        

        which means the webhdfs servlet already has a dependency on the ipc package, so I didn't duplicate the code.
        Would you please let me know whether that makes sense?

        Thanks a lot.

        Yongjun Zhang added a comment -

        Hi Daryn Sharp,

        Thanks a lot for the review and comments. I attempted earlier to let the UserProvider class throw a different exception, but found that it inherits from classes in the jersey package, whose interface spec we cannot change. I could make a duplicate copy of the IPC server code in the ExceptionHandler area so they are not tied to each other. I'm open to that.

        Daryn Sharp added a comment -

        I had to hack around this problem in HDFS-6222... I'm a bit uneasy about tying the webhdfs servlet to the IPC server. I'd rather see the logic contained within webhdfs. I think UserProvider should throw a different exception that its ExceptionHandler specially knows to unwrap.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12649538/HDFS-6475.002.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

        org.apache.hadoop.hdfs.web.TestWebHDFS

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7071//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7071//console

        This message is automatically generated.

        Yongjun Zhang added a comment -

        Hello Jing Zhao,

        Thanks for your earlier suggestion. Sorry for the delay in taking care of it.

        I just uploaded a patch. I was able to verify, on a real cluster where I saw the problem, that the patch fixes the issue. However, I was not successful in creating a testcase for it. Since this new patch reuses the method getTrueCause() in Server.java, the remaining thing to be checked by a unit test would be the change I made in ExceptionHandler.

        The change in ExceptionHandler is, for ContainerException and SecurityException, to call getTrueCause() to find the real exception by following the "cause" chain of the ContainerException/SecurityException. The original code in ExceptionHandler only does one level of cause-seeking, and only for ContainerException.
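        To illustrate the idea, here is a simplified sketch of such a cause-chain walk. This is not the actual patch or the real Server#getTrueCause implementation; the class name and the plain IOException standing in for StandbyException are illustrative assumptions.

        ```java
        import java.io.IOException;

        public class CauseUnwrapSketch {
            /**
             * Walk the cause chain of wrapper exceptions (here just
             * SecurityException) until the real failure is reached.
             * Simplified sketch of the unwrapping idea, not the actual
             * Server#getTrueCause code.
             */
            static Throwable getTrueCause(Throwable t) {
                while (t instanceof SecurityException && t.getCause() != null) {
                    t = t.getCause();
                }
                return t;
            }

            public static void main(String[] args) {
                // Simulate the standby NN case: an IOException standing in for
                // StandbyException, wrapped in a SecurityException the way
                // UserProvider wraps it.
                IOException standby = new IOException("StandbyException");
                SecurityException wrapper = new SecurityException(
                    "Failed to obtain user group information: " + standby, standby);
                System.out.println(getTrueCause(wrapper).getMessage());
                // prints "StandbyException"
            }
        }
        ```

        The point is that one level of unwrapping is not enough when the real exception sits two or more causes deep, which is why the loop follows the chain rather than taking a single getCause().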

        Would you please help take a look at the patch to see if it can be committed without a unit testcase, or let me know if you have any other advice?

        Thanks a lot.

        Yongjun Zhang added a comment -

        Hello Jing, many thanks for the very helpful info. I'm behind on other work and will look at this a bit later.

        Jing Zhao added a comment -

        Hi Yongjun Zhang, thanks for the response and figuring out the issue. Maybe we can put the true cause logic into ExceptionHandler#toResponse?

        Yongjun Zhang added a comment -

        Hi Jing,

        Thanks a lot for the info! I took a quick look; the issue is similar, but there seems to be an important difference here. In the HDFS-5322 fix, the method (and its entire caller hierarchy)

          private void saslProcess(RpcSaslProto saslMessage)
                throws WrappedRpcServerException, IOException, InterruptedException {
        

        is allowed to throw IOException, so your HDFS-5322 solution works well.

        For HDFS-6475, the involved class UserProvider is not allowed to throw IOException. In fact,
        UserProvider throws only an unchecked exception, a SecurityException,
        to include the StandbyException info in the message and cause:

        /** Inject user information to http operations. */
        @Provider
        public class UserProvider
            extends AbstractHttpContextInjectable<UserGroupInformation>
            implements InjectableProvider<Context, Type> {
          @Context HttpServletRequest request;
          @Context ServletContext servletcontext;
           ......
          @Override
          public UserGroupInformation getValue(final HttpContext context) {
            final Configuration conf = (Configuration) servletcontext
                .getAttribute(JspHelper.CURRENT_CONF);
            try {
              return JspHelper.getUGI(servletcontext, request, conf,
                  AuthenticationMethod.KERBEROS, false);
            } catch (IOException e) {
              throw new SecurityException(
                  "Failed to obtain user group information: " + e, e);
            }
          }
        

        This means we can't throw StandbyException (which inherits from IOException) from here,
        so my uploaded patch tried to parse the message string of the SecurityException thrown
        here.
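        As a rough illustration of that message-string approach (a hypothetical helper, not the code from the patch), the check relies on the SecurityException message embedding the original exception's toString(), so the wrapped exception's class name shows up in the text:

        ```java
        public class StandbyMessageCheckSketch {
            /**
             * Hypothetical check for a wrapped StandbyException: since the
             * SecurityException message was built as
             * "Failed to obtain user group information: " + e, the wrapped
             * exception's name appears in the message text and can be matched.
             */
            static boolean mentionsStandbyException(String message) {
                return message != null && message.contains("StandbyException");
            }

            public static void main(String[] args) {
                // Message shaped like the one in the bug report above.
                String msg = "Failed to obtain user group information: "
                    + "org.apache.hadoop.security.token.SecretManager$InvalidToken: "
                    + "StandbyException";
                System.out.println(mentionsStandbyException(msg));
                // prints "true"
            }
        }
        ```

        String matching like this is fragile compared with walking the actual cause chain, which is part of why the discussion moved toward the getTrueCause() approach instead.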

        The UserProvider class inherits from classes in the jersey package, whose interface spec we cannot change.

        We might be able to change the client/server interface: we could detect this kind of case
        at the interface and, instead of throwing a RemoteException that wraps the SecurityException,
        throw a RemoteException that wraps the underlying StandbyException. I'm not sure whether
        we should go this route though.

        Would you please comment again? Thanks.

        Jing Zhao added a comment -

        Looks like the similar issue on the RPC side has been fixed in HDFS-5322. Maybe we can follow the same pattern to fix this (see Server#getTrueCause).

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12647850/HDFS-6475.001.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. There were no new javadoc warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7026//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7026//console

        This message is automatically generated.

        Yongjun Zhang added a comment -

        Uploaded a simple patch; I hope to get advice on a better approach.
        Thanks.


          People

          • Assignee:
            Yongjun Zhang
            Reporter:
            Yongjun Zhang
          • Votes:
            0
            Watchers:
            6