ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-1437

Client uses session before SASL authentication complete

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.3
    • Fix Version/s: 3.4.4, 3.5.0
    • Component/s: java client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Found issue in the context of hbase region server startup, but can be reproduced w/ zkCli alone.

      getData may occur prior to SaslAuthenticated and fail with NoAuth. This is not expected behavior when the client is configured to use SASL.

      1. ZOOKEEPER-1437.patch
        38 kB
        Eugene Koontz
      2. ZOOKEEPER-1437.patch
        37 kB
        Eugene Koontz
      3. getXidCallHierarchy.png
        156 kB
        Eugene Koontz
      4. ZOOKEEPER-1437.patch
        37 kB
        Eugene Koontz
      5. ZOOKEEPER-1437.patch
        37 kB
        Eugene Koontz
      6. ZOOKEEPER-1437.patch
        33 kB
        Eugene Koontz
      7. ZOOKEEPER-1437.patch
        30 kB
        Eugene Koontz
      8. ZOOKEEPER-1437.patch
        30 kB
        Eugene Koontz
      9. ZOOKEEPER-1437.patch
        9 kB
        Eugene Koontz
      10. ZOOKEEPER-1437.patch
        9 kB
        Eugene Koontz
      11. ZOOKEEPER-1437.patch
        18 kB
        Eugene Koontz
      12. ZOOKEEPER-1437.patch
        16 kB
        Eugene Koontz
      13. ZOOKEEPER-1437.patch
        18 kB
        Eugene Koontz
      14. ZOOKEEPER-1437.patch
        13 kB
        Eugene Koontz
      15. ZOOKEEPER-1437.patch
        14 kB
        Eugene Koontz
      16. ZOOKEEPER-1437.patch
        14 kB
        Eugene Koontz
      17. ZOOKEEPER-1437.patch
        15 kB
        Eugene Koontz
      18. ZOOKEEPER-1437.patch
        12 kB
        Eugene Koontz

        Issue Links

          Activity

          Hide
          Eugene Koontz added a comment -

          Hi Jordan,

          What version of Java are you using to run the Java client? If it's Java 7, your problem in fact might be ZOOKEEPER-1550.

          -Eugene

          Show
          Eugene Koontz added a comment - Hi Jordan, What version of Java are you using to run the Java client? If it's Java 7, your problem in fact might be ZOOKEEPER-1550 . -Eugene
          Hide
          Jordan Zimmerman added a comment -

          I'm still seeing this in a 3.4.4 client connecting to a 3.3.5 ensemble:

          2012-10-14 20:31:34,814 INFO org.apache.zookeeper.ZooKeeper:1 [http-0.0.0.0-7101-5] [log] Initiating client connection, connectString=localhost:2181 sessionTimeout=300000 watcher=com.netflix.curator.ConnectionState@4715dc3e
          2012-10-14 20:31:34,825 INFO org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
          2012-10-14 20:31:34,835 INFO org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Socket connection established to localhost/127.0.0.1:2181, initiating session
          2012-10-14 20:31:34,835 DEBUG org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Session establishment request sent on localhost/127.0.0.1:2181
          2012-10-14 20:31:34,858 WARN org.apache.zookeeper.ClientCnxnSocket:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Connected to an old server; r-o mode will be unavailable
          2012-10-14 20:31:34,858 INFO org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13a60f640480000, negotiated timeout = 40000
          2012-10-14 20:31:35,379 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:35,864 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:35,864 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:48,166 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:48,167 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:48,168 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:48,168 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:48,169 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:48,169 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes.
          2012-10-14 20:31:48,169 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: null request:: null response:: nulluntil SASL authentication completes.

          Show
          Jordan Zimmerman added a comment - I'm still seeing this in a 3.4.4 client connecting to a 3.3.5 ensemble: 2012-10-14 20:31:34,814 INFO org.apache.zookeeper.ZooKeeper:1 [http-0.0.0.0-7101-5] [log] Initiating client connection, connectString=localhost:2181 sessionTimeout=300000 watcher=com.netflix.curator.ConnectionState@4715dc3e 2012-10-14 20:31:34,825 INFO org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) 2012-10-14 20:31:34,835 INFO org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Socket connection established to localhost/127.0.0.1:2181, initiating session 2012-10-14 20:31:34,835 DEBUG org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Session establishment request sent on localhost/127.0.0.1:2181 2012-10-14 20:31:34,858 WARN org.apache.zookeeper.ClientCnxnSocket:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Connected to an old server; r-o mode will be unavailable 2012-10-14 20:31:34,858 INFO org.apache.zookeeper.ClientCnxn:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13a60f640480000, negotiated timeout = 40000 2012-10-14 20:31:35,379 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:35,864 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:35,864 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:48,166 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:48,167 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:48,168 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:48,168 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:48,169 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:48,169 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: 0,12 replyHeader:: 0,0,0 request:: '/,F response:: v{} until SASL authentication completes. 2012-10-14 20:31:48,169 DEBUG org.apache.zookeeper.ClientCnxnSocketNIO:1 [http-0.0.0.0-7101-5-SendThread(localhost:2181)] [log] deferring non-priming packet: clientPath:null serverPath:null finished:false header:: -2,11 replyHeader:: null request:: null response:: nulluntil SASL authentication completes.
          Hide
          Hudson added a comment -

          Integrated in ZooKeeper-trunk #1675 (See https://builds.apache.org/job/ZooKeeper-trunk/1675/)
          ZOOKEEPER-1437. Client uses session before SASL authentication complete (Eugene Koontz via mahadev) (Revision 1382555)

          Result = SUCCESS
          mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1382555
          Files :

          • /zookeeper/trunk/CHANGES.txt
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocket.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthDesignatedClientTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthFailDesignatedClientTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthFailTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthMissingClientConfigTest.java
          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthTest.java
          Show
          Hudson added a comment - Integrated in ZooKeeper-trunk #1675 (See https://builds.apache.org/job/ZooKeeper-trunk/1675/ ) ZOOKEEPER-1437 . Client uses session before SASL authentication complete (Eugene Koontz via mahadev) (Revision 1382555) Result = SUCCESS mahadev : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1382555 Files : /zookeeper/trunk/CHANGES.txt /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocket.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java /zookeeper/trunk/src/java/main/org/apache/zookeeper/client/ZooKeeperSaslClient.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthDesignatedClientTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthFailDesignatedClientTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthFailTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthMissingClientConfigTest.java /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/SaslAuthTest.java
          Hide
          Mahadev konar added a comment -

          I just committed this. This is a big one. Thanks Eugene and others who helped reviewing it.

          Show
          Mahadev konar added a comment - I just committed this. This is a big one. Thanks Eugene and others who helped reviewing it.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12543934/ZOOKEEPER-1437.patch
          against trunk revision 1380931.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1179//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1179//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1179//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12543934/ZOOKEEPER-1437.patch against trunk revision 1380931. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1179//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1179//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1179//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Fixes SaslAuthDesignatedServerTest by checking for both system property JAAS_CONF_KEY and java.security.auth.login.Configuration.getConfiguration().

          Show
          Eugene Koontz added a comment - Fixes SaslAuthDesignatedServerTest by checking for both system property JAAS_CONF_KEY and java.security.auth.login.Configuration.getConfiguration().
          Hide
          Mahadev konar added a comment -

          Thanks Eugene!

          Show
          Mahadev konar added a comment - Thanks Eugene!
          Hide
          Eugene Koontz added a comment -

          I see what's going on: my patch is looking for the system property "java.security.auth.login.config" variable, which is not defined by the test that failed (SaslAuthDesignatedServerTest).

          Will have a new patch up very soon.

          Show
          Eugene Koontz added a comment - I see what's going on: my patch is looking for the system property "java.security.auth.login.config" variable, which is not defined by the test that failed (SaslAuthDesignatedServerTest). Will have a new patch up very soon.
          Hide
          Mahadev konar added a comment -

          Eugene,
          Looks like there is a one last glitch - the test failed. Can you please take a look? I think the patch is ready to go (except for the test case).

          Show
          Mahadev konar added a comment - Eugene, Looks like there is a one last glitch - the test failed. Can you please take a look? I think the patch is ready to go (except for the test case).
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12543740/ZOOKEEPER-1437.patch
          against trunk revision 1380130.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1176//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1176//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1176//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12543740/ZOOKEEPER-1437.patch against trunk revision 1380130. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1176//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1176//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1176//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Ben, You are right: a new ZooKeeperSaslClient will be created for each new connection, since it's a member variable of a ClientCnxn. So it doesn't seem possible for a ZooKeeperSaslClient to retry across two subsequent connections.

          Show
          Eugene Koontz added a comment - Ben, You are right: a new ZooKeeperSaslClient will be created for each new connection, since it's a member variable of a ClientCnxn. So it doesn't seem possible for a ZooKeeperSaslClient to retry across two subsequent connections.
          Hide
          Eugene Koontz added a comment -

          Merge with trunk and pass along exception as second argument of SaslException. Thanks to Rakesh for pointing it out.

          Show
          Eugene Koontz added a comment - Merge with trunk and pass along exception as second argument of SaslException. Thanks to Rakesh for pointing it out.
          Hide
          Eugene Koontz added a comment -

          Sorry, meant to link to ZOOKEEPER-1547.

          Show
          Eugene Koontz added a comment - Sorry, meant to link to ZOOKEEPER-1547 .
          Hide
          Eugene Koontz added a comment -

          Hi Rakesh and Ben,

          I've added https://issues.apache.org/jira/browse/ZOOKEEPER-1437 based on your comments. I addressed it specifically to SASL, though perhaps it is a more general issue of testing the client's coping with all kinds of dropped packets, not just SASL ones.

          Show
          Eugene Koontz added a comment - Hi Rakesh and Ben, I've added https://issues.apache.org/jira/browse/ZOOKEEPER-1437 based on your comments. I addressed it specifically to SASL, though perhaps it is a more general issue of testing the client's coping with all kinds of dropped packets, not just SASL ones.
          Hide
          Benjamin Reed added a comment -

          +1 looks great for the next release. it would be nice to address the exceptions that Rakesh brought up and set the causing exception as the cause rather than embedding in the message.

          as for the disconnect problem. it is slightly hard to follow, but if there is a network failure, wont the connection reset and a new ZooKeeperSaslClient be created anyway? it would be great to add a test that simply drops responses to clients without closing connections. i'm fine with doing it in a separate patch.

          Show
          Benjamin Reed added a comment - +1 looks great for the next release. it would be nice to address the exceptions that Rakesh brought up and set the causing exception as the cause rather than embedding in the message. as for the disconnect problem. it is slightly hard to follow, but if there is a network failure, wont the connection reset and a new ZooKeeperSaslClient be created anyway? it would be great to add a test that simply drops responses to clients without closing connections. i'm fine with doing it in a separate patch.
          Hide
          Rakesh R added a comment -

          Yeah Eugene, retry logic is fine.

          I think it would be good to test the SASL exchange in the presence of network failures - I'm wondering what support there is already in Zookeeper tests for network failure and if so, can we use them to test the SASL exchange.

          I could see DisconnectableZooKeeper.disconnect() has network delays/partition simulation logic. Hope this will help you to get more idea.

          Show
          Rakesh R added a comment - Yeah Eugene, retry logic is fine. I think it would be good to test the SASL exchange in the presence of network failures - I'm wondering what support there is already in Zookeeper tests for network failure and if so, can we use them to test the SASL exchange. I could see DisconnectableZooKeeper.disconnect() has network delays/partition simulation logic. Hope this will help you to get more idea.
          Hide
          Eugene Koontz added a comment -

          Hi Rakesh,

          Sorry to take a long time to respond to you. I'll try to be more prompt especially as folks are hoping to get this in a release soon.

          I see what you are asking: should we retry if there's a network error connecting to the client (in your terms, a partition)?

          I guess we could add some retry logic - something like:

          while (!(saslClient.isComplete()) {
             try {
               sendSaslPacket(saslToken, cnxn); // throws SaslException, IOException
             } catch (IOException ne) {
               // transitory network failure: keep trying to reach server
             } catch (SaslException se) {
               // auth failed: nothing more we can do.
               break;
             }
          }
          

          Is that what you had in mind?

          I think it would be good to test the SASL exchange in the presence of network failures - I'm wondering what support there is already in Zookeeper tests for network failure and if so, can we use them to test the SASL exchange.

          Show
          Eugene Koontz added a comment - Hi Rakesh, Sorry to take a long time to respond to you. I'll try to be more prompt especially as folks are hoping to get this in a release soon. I see what you are asking: should we retry if there's a network error connecting to the client (in your terms, a partition)? I guess we could add some retry logic - something like: while (!(saslClient.isComplete()) { try { sendSaslPacket(saslToken, cnxn); // throws SaslException, IOException } catch (IOException ne) { // transitory network failure: keep trying to reach server } catch (SaslException se) { // auth failed: nothing more we can do . break ; } } Is that what you had in mind? I think it would be good to test the SASL exchange in the presence of network failures - I'm wondering what support there is already in Zookeeper tests for network failure and if so, can we use them to test the SASL exchange.
          Hide
          Rakesh R added a comment -

          Hi Eugene, Sorry to join late. BTW the patch looks great and just few suggestions.


          #1#

          throw new SaslException("Failed to send SASL packet to server due " +
                        "to IOException:" + e);
          LOG.error("SASL authentication with Zookeeper Quorum member failed: " + e);
          

          Instead of appending the exception, would be good to pass exception as an argument.


          #2# Say, client and server got partitioned. Now the 'respondToServer' will fail and marking the state as FAILED, also 'gotLastPacket = true'. Here I'm just doubting, whether to mark 'gotLastPacket=true' rather what about considering as authenticationInProgress and continue?

          public void respondToServer(byte[] serverToken, ClientCnxn cnxn) {
                if (!(saslClient.isComplete())) {
                   try {
                   //...
                        sendSaslPacket(saslToken, cnxn);
                   //....
                   } catch (SaslException e) {
                       saslState = SaslState.FAILED;
                       gotLastPacket = true;
                   }
          }
          
          public boolean clientTunneledAuthenticationInProgress() {
                    //....
                    //...
                      if (((isComplete()) ||
                              (isFailed())) &&
                              (gotLastPacket == false)) {
                          return true;
                      }
          }
          
          Show
          Rakesh R added a comment - Hi Eugene, Sorry to join late. BTW the patch looks great and just few suggestions. #1# throw new SaslException( "Failed to send SASL packet to server due " + "to IOException:" + e); LOG.error( "SASL authentication with Zookeeper Quorum member failed: " + e); Instead of appending the exception, would be good to pass exception as an argument. #2# Say, client and server got partitioned. Now the 'respondToServer' will fail and marking the state as FAILED, also 'gotLastPacket = true'. Here I'm just doubting, whether to mark 'gotLastPacket=true' rather what about considering as authenticationInProgress and continue? public void respondToServer( byte [] serverToken, ClientCnxn cnxn) { if (!(saslClient.isComplete())) { try { //... sendSaslPacket(saslToken, cnxn); //.... } catch (SaslException e) { saslState = SaslState.FAILED; gotLastPacket = true ; } } public boolean clientTunneledAuthenticationInProgress() { //.... //... if (((isComplete()) || (isFailed())) && (gotLastPacket == false )) { return true ; } }
          Hide
          Eugene Koontz added a comment -

          Hi Mahadev,
          Thanks for looking at the patch and your comments. Responding to your comments:

          You added

          public boolean readOnly;

          In ClientCnxn.java Packet class which doesnt seem to be used anywhere, am I correct?

          This was added so that packet-serialization can be deferred until the
          packet is actually sent: we need to save the readOnly parameter,
          passed as an argument to the packet constructor, until it is actually
          used in the serialization of the packet. In this patch this
          serialization is now done in a newly-added method,
          ClientCnxnSocketNIO::sendPacket(), which serializes the input packet
          'p' by calling p.createBB(). createBB() uses the packet's 'readOnly'
          member variable in order to create the serialization of the packet.

          I think we are a little weak on our synchronizations in the

          patch. I will take a look again but looks like its using a lot of

          member variables which could get changed by various threads.

          I think there is a race condition in

          ClientCnxnSocketNIO:findSendablePacket() wherein how do you make

          sure that the sasl packets (without header) are present in the

          queue before we start running the thread.

          I don't think there is a race condition possible. My reasoning is that
          the sendThread both sends packets from the outgoingQueue and manages
          the zooKeeperSaslClient object. Therefore it is not possible to send
          packets from the outgoing queue (besides the initial priming packet) without the zooKeeperSaslClient being
          finished with authentication (whether successfully or not).

          In the code from the patch that you
          cite, clientTunneledAuthenticationInProgress() must have returned
          false in order to get inside the {{ else{} }}. For
          clientTunneledAuthenticationInProgress() is false, then at least one
          of the following must be true:

          1. saslAuthFailed is true, or
          2. zooKeeperSaslClient's clientTunneledAuthenticationInProgress() is false.

          With regard to 1., saslAuthFailed can only be true if a LoginException was caught within
          startConnect(), but startConnect() can only be run by the sendThread.

          With regard to 2., this can only be false if gotLastPacket is true,
          and either:

          2.1. saslState is COMPLETE, or
          2.2. saslState is FAILED.

          But gotLastPacket is initialized to false when the sendThread creates it,
          and can only be set to true by code run by the sendThread
          (specifically by ZooKeeperSaslClient::respondToServer()).

          Show
          Eugene Koontz added a comment - Hi Mahadev, Thanks for looking at the patch and your comments. Responding to your comments: You added public boolean readOnly; In ClientCnxn.java Packet class which doesnt seem to be used anywhere, am I correct? This was added so that packet-serialization can be deferred until the packet is actually sent: we need to save the readOnly parameter, passed as an argument to the packet constructor, until it is actually used in the serialization of the packet. In this patch this serialization is now done in a newly-added method, ClientCnxnSocketNIO::sendPacket(), which serializes the input packet 'p' by calling p.createBB(). createBB() uses the packet's 'readOnly' member variable in order to create the serialization of the packet. I think we are a little weak on our synchronizations in the patch. I will take a look again but looks like its using a lot of member variables which could get changed by various threads. I think there is a race condition in ClientCnxnSocketNIO:findSendablePacket() wherein how do you make sure that the sasl packets (without header) are present in the queue before we start running the thread. I don't think there is a race condition possible. My reasoning is that the sendThread both sends packets from the outgoingQueue and manages the zooKeeperSaslClient object. Therefore it is not possible to send packets from the outgoing queue (besides the initial priming packet) without the zooKeeperSaslClient being finished with authentication (whether successfully or not). In the code from the patch that you cite, clientTunneledAuthenticationInProgress() must have returned false in order to get inside the {{ else{} }}. For clientTunneledAuthenticationInProgress() is false, then at least one of the following must be true: 1. saslAuthFailed is true, or 2. zooKeeperSaslClient's clientTunneledAuthenticationInProgress() is false. With regard to 1., saslAuthFailed can only be true if a LoginException was caught within startConnect() , but startConnect() can only be run by the sendThread. With regard to 2., this can only be false if gotLastPacket is true, and either: 2.1. saslState is COMPLETE, or 2.2. saslState is FAILED. But gotLastPacket is initialized to false when the sendThread creates it, and can only be set to true by code run by the sendThread (specifically by ZooKeeperSaslClient::respondToServer() ).
          Hide
          Mahadev konar added a comment -

          Looked at the patch. Couple of comments on it:

          • You added
             public boolean readOnly; 

            In ClientCnxn.java Packet class which doesnt seem to be used anywhere, am I correct?

          • I think we are a little weak on our synchronizations in the patch. I will take a look again but looks like its using a lot of member variables which could get changed by various threads.
          • I think there is a race condition in ClientCnxnSocketNIO:findSendablePacket() wherein how do you make sure that the sasl packets (without header) are present in the queue before we start running the thread. What I mean to say is that is it possible that we will land up with
            } else {
                              // Tunnelled authentication is not in progress: just
                              // send the first packet in the queue.
                              return outgoingQueue.getFirst();
                          }
          

          Even though we are not done authenticating via sasl?

          Show
          Mahadev konar added a comment - Looked at the patch. Couple of comments on it: You added public boolean readOnly; In ClientCnxn.java Packet class which doesnt seem to be used anywhere, am I correct? I think we are a little weak on our synchronizations in the patch. I will take a look again but looks like its using a lot of member variables which could get changed by various threads. I think there is a race condition in ClientCnxnSocketNIO:findSendablePacket() wherein how do you make sure that the sasl packets (without header) are present in the queue before we start running the thread. What I mean to say is that is it possible that we will land up with } else { // Tunnelled authentication is not in progress: just // send the first packet in the queue. return outgoingQueue.getFirst(); } Even though we are not done authenticating via sasl?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12532977/getXidCallHierarchy.png
          against trunk revision 1351541.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1107//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532977/getXidCallHierarchy.png against trunk revision 1351541. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1107//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Thanks, Patrick. Also I've created a new review here:

          https://reviews.apache.org/r/5505/

          -Eugene

          Show
          Eugene Koontz added a comment - Thanks, Patrick. Also I've created a new review here: https://reviews.apache.org/r/5505/ -Eugene
          Hide
          Patrick Hunt added a comment -

          This (the patch I was using) was from a day or two ago - I'll pull the latest and let you know. thx.

          Show
          Patrick Hunt added a comment - This (the patch I was using) was from a day or two ago - I'll pull the latest and let you know. thx.
          Hide
          Eugene Koontz added a comment -

          Hi Patrick,

          Thanks for the information about Eclipse. I'm attaching a screenshot showing my Call Hierarchy view.

          The code that calls ServerSaslResponseCallback.processResult has been removed in the latest patch (as well as in the last few most recent patches).

          (See the section of the patch beginning with "@@ -553,17 +551,6 @@ public class ClientCnxn {" )

          So I don't think there's a possibility anymore of the EventThread creating or sending packets.

          Show
          Eugene Koontz added a comment - Hi Patrick, Thanks for the information about Eclipse. I'm attaching a screenshot showing my Call Hierarchy view. The code that calls ServerSaslResponseCallback.processResult has been removed in the latest patch (as well as in the last few most recent patches). (See the section of the patch beginning with "@@ -553,17 +551,6 @@ public class ClientCnxn {" ) So I don't think there's a possibility anymore of the EventThread creating or sending packets.
          Hide
          Eugene Koontz added a comment -

          Fix bug in last patch that was not sending SaslAuthenticated event after authentication finishes.

          Show
          Eugene Koontz added a comment - Fix bug in last patch that was not sending SaslAuthenticated event after authentication finishes.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12532965/ZOOKEEPER-1437.patch
          against trunk revision 1351541.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1106//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1106//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1106//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532965/ZOOKEEPER-1437.patch against trunk revision 1351541. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1106//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1106//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1106//console This message is automatically generated.
          Hide
          Patrick Hunt added a comment -

          Hi Eugene, here's the call hierarchy from Eclipse (one of many) and the root of my concern:

          org.apache.zookeeper.ClientCnxn.sendPacket(Record, Record, AsyncCallback, int)
          org.apache.zookeeper.client.ZooKeeperSaslClient.sendSaslPacket(byte[], ClientCnxn)
          org.apache.zookeeper.client.ZooKeeperSaslClient.respondToServer(byte[], ClientCnxn)
          org.apache.zookeeper.client.ZooKeeperSaslClient.ServerSaslResponseCallback.processResult(int, String, Object, byte[], Stat)
          org.apache.zookeeper.ClientCnxn.EventThread.processEvent(Object)
          org.apache.zookeeper.ClientCnxn.EventThread.run()
          

          Try doing a "open call hierarchy" in eclipse for getXid.

          Show
          Patrick Hunt added a comment - Hi Eugene, here's the call hierarchy from Eclipse (one of many) and the root of my concern: org.apache.zookeeper.ClientCnxn.sendPacket(Record, Record, AsyncCallback, int) org.apache.zookeeper.client.ZooKeeperSaslClient.sendSaslPacket(byte[], ClientCnxn) org.apache.zookeeper.client.ZooKeeperSaslClient.respondToServer(byte[], ClientCnxn) org.apache.zookeeper.client.ZooKeeperSaslClient.ServerSaslResponseCallback.processResult(int, String, Object, byte[], Stat) org.apache.zookeeper.ClientCnxn.EventThread.processEvent(Object) org.apache.zookeeper.ClientCnxn.EventThread.run() Try doing a "open call hierarchy" in eclipse for getXid.
          Hide
          Eugene Koontz added a comment -

          Hi Patrick,

          Thanks as always for your comments and help.

          With regard to Xids: I don't see how the event thread can call sendPacket(). It seems to me that only the send thread can call getXid() or sendPacket(), and therefore, packets can only be sent in the correct order.

          I put some test logging in ClientCnxn::getXid():

          LOG.info("getXid() being called from thread: " + Thread.currentThread().getName());
          

          and the logging always reports SendThread.

          With regard to spinning on the write interest - you are right, this was happening. I've created a new patch that prevents spinning in doIO(). It works by calling disableWrite() in doIO() if no packet can be sent because SASL authentication is ongoing. enableWrite(), in turn, is called after SASL authentication is completed. This makes the patch a bit more complicated to follow, I'm afraid, but I think I'm on the right track.

          I'm attaching the patch and opening a review on reviews.apache.org as you asked.

          Show
          Eugene Koontz added a comment - Hi Patrick, Thanks as always for your comments and help. With regard to Xids: I don't see how the event thread can call sendPacket(). It seems to me that only the send thread can call getXid() or sendPacket(), and therefore, packets can only be sent in the correct order. I put some test logging in ClientCnxn::getXid(): LOG.info( "getXid() being called from thread: " + Thread .currentThread().getName()); and the logging always reports SendThread. With regard to spinning on the write interest - you are right, this was happening. I've created a new patch that prevents spinning in doIO(). It works by calling disableWrite() in doIO() if no packet can be sent because SASL authentication is ongoing. enableWrite(), in turn, is called after SASL authentication is completed. This makes the patch a bit more complicated to follow, I'm afraid, but I think I'm on the right track. I'm attaching the patch and opening a review on reviews.apache.org as you asked.
          Hide
          Patrick Hunt added a comment -

          I looked through the patch, some questions.

          1) moving the xid is a bit concerning.

          Isn't there a chance that the next xid (A) will be generated, and a separate xid (B) generated, and B sent on the wire before A? Notice that getXid can be called by doio, but also by sendPacket via the event thread. Or does something else protect against this?

          2) I'm still concerned about spinning on the write interest flag. Have you considered this?

          I'm still looking, but those were my initial thoughts.

          Show
          Patrick Hunt added a comment - I looked through the patch, some questions. 1) moving the xid is a bit concerning. Isn't there a chance that the next xid (A) will be generated, and a separate xid (B) generated, and B sent on the wire before A? Notice that getXid can be called by doio, but also by sendPacket via the event thread. Or does something else protect against this? 2) I'm still concerned about spinning on the write interest flag. Have you considered this? I'm still looking, but those were my initial thoughts.
          Hide
          Patrick Hunt added a comment -

          Eugene would you mind moving it back over now? (rb) that way we can see the review updates here in the jira. Thx.

          Show
          Patrick Hunt added a comment - Eugene would you mind moving it back over now? (rb) that way we can see the review updates here in the jira. Thx.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12532777/ZOOKEEPER-1437.patch
          against trunk revision 1351541.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1102//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1102//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1102//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532777/ZOOKEEPER-1437.patch against trunk revision 1351541. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1102//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1102//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1102//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Thanks, Patrick, good to know. I'll use that in the future.

          Show
          Eugene Koontz added a comment - Thanks, Patrick, good to know. I'll use that in the future.
          Hide
          Patrick Hunt added a comment -

          btw, reviews.apache.org is back now.

          Show
          Patrick Hunt added a comment - btw, reviews.apache.org is back now.
          Hide
          Eugene Koontz added a comment -

          Latest patch fixes problem when using Kerberos as SASL mechanism, because ZK server sends a final empty SASL packet at end of authentication. Also tries to minimize changes to existing, non-SASL related files.

          Show
          Eugene Koontz added a comment - Latest patch fixes problem when using Kerberos as SASL mechanism, because ZK server sends a final empty SASL packet at end of authentication. Also tries to minimize changes to existing, non-SASL related files.
          Hide
          Eugene Koontz added a comment -

          Thanks Andy, using review.cloudera.org: https://review.cloudera.org/r/2150/

          Show
          Eugene Koontz added a comment - Thanks Andy, using review.cloudera.org: https://review.cloudera.org/r/2150/
          Hide
          Andrew Purtell added a comment -

          review.cloudera.org is available.

          Show
          Andrew Purtell added a comment - review.cloudera.org is available.
          Hide
          Eugene Koontz added a comment -

          Unfortunately I don't know when I can post a review to the reviewboard since it seems down for an indefinite period; see https://issues.apache.org/jira/browse/INFRA-4888

          Show
          Eugene Koontz added a comment - Unfortunately I don't know when I can post a review to the reviewboard since it seems down for an indefinite period; see https://issues.apache.org/jira/browse/INFRA-4888
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12531625/ZOOKEEPER-1437.patch
          against trunk revision 1337029.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1092//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1092//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1092//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531625/ZOOKEEPER-1437.patch against trunk revision 1337029. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1092//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1092//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1092//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Fix findbugs warning introduced in last patch.

          Show
          Eugene Koontz added a comment - Fix findbugs warning introduced in last patch.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12531621/ZOOKEEPER-1437.patch
          against trunk revision 1337029.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1091//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1091//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1091//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12531621/ZOOKEEPER-1437.patch against trunk revision 1337029. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1091//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1091//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1091//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          https://reviews.apache.org/ is down for maintenance - will post review when it's back up.

          Show
          Eugene Koontz added a comment - https://reviews.apache.org/ is down for maintenance - will post review when it's back up.
          Hide
          Eugene Koontz added a comment -

          Hi Patrick,
          Thanks for your patience and help with this. I am attaching a new patch and also creating a review board review for this patch.

          In this most recent patch, I am generating the SASL packets on the fly as you mentioned, and sending these immediately rather than queuing them.

          Also, there is no synchronization mechanisms (no CountDownLatch) added in this patch. This eliminates the blocking in queueing that you pointed out in your last comment.

          I've moved the Xid-setting of the packets to immediately before they are sent (whether they are SASL packets or queued packets), in order to make sure that the packets' Xids are always in order of their being sent.

          Thank you Himanshu for your testing - I believe my latest patch will prevent packets with out-of-order Xids that you experienced.

          Show
          Eugene Koontz added a comment - Hi Patrick, Thanks for your patience and help with this. I am attaching a new patch and also creating a review board review for this patch. In this most recent patch, I am generating the SASL packets on the fly as you mentioned, and sending these immediately rather than queuing them. Also, there is no synchronization mechanisms (no CountDownLatch) added in this patch. This eliminates the blocking in queueing that you pointed out in your last comment. I've moved the Xid-setting of the packets to immediately before they are sent (whether they are SASL packets or queued packets), in order to make sure that the packets' Xids are always in order of their being sent. Thank you Himanshu for your testing - I believe my latest patch will prevent packets with out-of-order Xids that you experienced.
          Hide
          Patrick Hunt added a comment -

          Sorry Eugene but I think I'm making a hash of explaining myself to you.

          queuePacket needs to be a non-blocking call. This latest patch makes it potentially blocking and that will likely break some things for end users.

          see my earlier comment:

          I'm wondering why approach it this way rather than checking if we are in the auth phase in "doIO" and then only sending auth packets until the auth phase has been finalized. (leave any non-auth packets in the queue)

          I think what we should be doing is queuing regular the packets as usual, however the "doIO" thread will not forward these packets to the server until the SASL auth phase is completed. SASL packets on the other hand should not be queued but generated on the fly based on the connection state (generated in doIO based on this state) Your patch from May 7 was pretty close - except for the issues I raised on May 9.

          Do you see my point wrt this earlier feedback? If not please lmk and I'll try to clarify specific points.

          Show
          Patrick Hunt added a comment - Sorry Eugene but I think I'm making a hash of explaining myself to you. queuePacket needs to be a non-blocking call. This latest patch makes it potentially blocking and that will likely break some things for end users. see my earlier comment: I'm wondering why approach it this way rather than checking if we are in the auth phase in "doIO" and then only sending auth packets until the auth phase has been finalized. (leave any non-auth packets in the queue) I think what we should be doing is queuing regular the packets as usual, however the "doIO" thread will not forward these packets to the server until the SASL auth phase is completed. SASL packets on the other hand should not be queued but generated on the fly based on the connection state (generated in doIO based on this state) Your patch from May 7 was pretty close - except for the issues I raised on May 9. Do you see my point wrt this earlier feedback? If not please lmk and I'll try to clarify specific points.
          Hide
          Himanshu Vashishtha added a comment -

          In case of cross realm authentication, I once got this exception:

          2012-05-19 20:19:43,776 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /XXX.XX.XX.XX:2181
          2012-05-19 20:19:43,776 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of th
          is process is 3544@<HIDDEN>.com
          2012-05-19 20:19:43,776 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Found Login Context section
           'Client': will use it to attempt to SASL-authenticate.
          2012-05-19 20:19:43,776 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will use GSSAPI as S
          ASL mechanism.
          2012-05-19 20:19:43,777 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to <hidden>.com XXX.XX.XXX.XX:2181, initiating session
          2012-05-19 20:19:43,780 INFO org.apache.hadoop.hbase.replication.ReplicationZookeeper: Added new peer clu
          ster 22
          2012-05-19 20:19:43,783 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
          2012-05-19 20:19:43,886 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server XXX.XX.XXX.X:2181, sessionid = 0x137683fb0680000, negotiated timeout = 40000
          2012-05-19 20:19:43,903 WARN org.apache.zookeeper.ClientCnxn: Session 0x137683fb0680000 for server  XXX.XX.XXX.X:1:2181, unexpected error, closing socket connection and attempting reconnect
          java.io.IOException: Xid out of order. Got Xid 2 with err 0 expected Xid 1 for a packet with details: cli
          entPath:null serverPath:null finished:false header:: 1,8  replyHeader:: 0,0,-4  request:: '/hbase/rs,F  r
          esponse:: v{}
                  at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:796)
                  at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:89)
                  at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:291)
                  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1039)
          2012-05-19 20:19:43,903 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient Z
          ooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = Conne
          ctionLoss for /hbase/rs
          2012-05-19 20:19:43,903 INFO org.apache.hadoop.hbase.util.RetryCounter: The 1 times to retry  after sleep
          ing 2000 ms
          2012-05-19 20:19:45,125 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server  XXX.XX.XXX.X:2181
          2012-05-19 20:19:45,151 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Found Login Context section
           'Client': will use it to attempt to SASL-authenticate.
          2012-05-19 20:19:45,152 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will use GSSAPI as S
          ASL mechanism.
          2012-05-19 20:19:45,152 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to  XXX.XX.XXX.X:2181, initiating session
          2012-05-19 20:19:45,155 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server  XXX.XX.XXX.X:2181, sessionid = 0x137683fb0680000, negotiated timeout = 40000
          
          

          Though, this happened only once, but I am pasting it here if that might be useful... as its HBase related, and its about transaction mis-ordering.

          Show
          Himanshu Vashishtha added a comment - In case of cross realm authentication, I once got this exception: 2012-05-19 20:19:43,776 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server /XXX.XX.XX.XX:2181 2012-05-19 20:19:43,776 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of th is process is 3544@<HIDDEN>.com 2012-05-19 20:19:43,776 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Found Login Context section 'Client': will use it to attempt to SASL-authenticate. 2012-05-19 20:19:43,776 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will use GSSAPI as S ASL mechanism. 2012-05-19 20:19:43,777 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to <hidden>.com XXX.XX.XXX.XX:2181, initiating session 2012-05-19 20:19:43,780 INFO org.apache.hadoop.hbase.replication.ReplicationZookeeper: Added new peer clu ster 22 2012-05-19 20:19:43,783 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo 2012-05-19 20:19:43,886 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server XXX.XX.XXX.X:2181, sessionid = 0x137683fb0680000, negotiated timeout = 40000 2012-05-19 20:19:43,903 WARN org.apache.zookeeper.ClientCnxn: Session 0x137683fb0680000 for server XXX.XX.XXX.X:1:2181, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Xid out of order. Got Xid 2 with err 0 expected Xid 1 for a packet with details: cli entPath: null serverPath: null finished: false header:: 1,8 replyHeader:: 0,0,-4 request:: '/hbase/rs,F r esponse:: v{} at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:796) at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:89) at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:291) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1039) 2012-05-19 20:19:43,903 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient Z ooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = Conne ctionLoss for /hbase/rs 2012-05-19 20:19:43,903 INFO org.apache.hadoop.hbase.util.RetryCounter: The 1 times to retry after sleep ing 2000 ms 2012-05-19 20:19:45,125 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server XXX.XX.XXX.X:2181 2012-05-19 20:19:45,151 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Found Login Context section 'Client': will use it to attempt to SASL-authenticate. 2012-05-19 20:19:45,152 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will use GSSAPI as S ASL mechanism. 2012-05-19 20:19:45,152 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to XXX.XX.XXX.X:2181, initiating session 2012-05-19 20:19:45,155 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server XXX.XX.XXX.X:2181, sessionid = 0x137683fb0680000, negotiated timeout = 40000 Though, this happened only once, but I am pasting it here if that might be useful... as its HBase related, and its about transaction mis-ordering.
          Hide
          Eugene Koontz added a comment -

          Hi Antonio,
          (my previous post seems to have failed with a 5xx error, apologies if this is a duplicate)

          I encourage you to provide a unit test, either with Curator or directly with just a Zookeeper client/server setup with a shell script. I git-cloned Curator but unfortunately I get some tests failures on trunk with ./gradelw build. Nothing against Curator: I would like to learn more about it, but I think it would be best to show the problem with purely Zookeeper code as much as possible.

          Show
          Eugene Koontz added a comment - Hi Antonio, (my previous post seems to have failed with a 5xx error, apologies if this is a duplicate) I encourage you to provide a unit test, either with Curator or directly with just a Zookeeper client/server setup with a shell script. I git-cloned Curator but unfortunately I get some tests failures on trunk with ./gradelw build . Nothing against Curator: I would like to learn more about it, but I think it would be best to show the problem with purely Zookeeper code as much as possible.
          Hide
          Antonio added a comment -

          Hi,

          I've tested with the latest patch (released on 18/May/12 23:03) but I get the same issue:

          2012-05-19 18:49:44,067 [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient  - ServerSaslResponseCallback(): using empty data[] as server response (length=0)
          2012-05-19 18:49:44,067 [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient  - saslToken (server) length: 0
          2012-05-19 18:49:44,067 [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient  - saslClient.evaluateChallenge(len=0)
          2012-05-19 18:49:44,068 [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient  - An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: DIGEST-MD5: Digest-challenge format violation: algorithm directive missing) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
          2012-05-19 18:49:44,068 [SyncThread:0] DEBUG org.apache.zookeeper.server.FinalRequestProcessor  - Processing request:: sessionid:0x137666e45f70000 type:sync: cxid:0xa zxid:0xfffffffffffffffe txntype:unknown reqpath:/
          2012-05-19 18:49:44,068 [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient  - SASL authentication failed using login context 'null'.
          

          Eugene, I'm using as java client Curator. I can provide a junit test to reproduce the issue if it's fine with you.

          The easiest way to reproduce it is with Curator and the provided TestingServer (curator-test.jar) but I got same using the curator client connected to a running zookeeper server.
          That's should be the same since the TestingServer is actually running an actual zookeeper server.

          Let me know and thanks.
          Antonio

          Show
          Antonio added a comment - Hi, I've tested with the latest patch (released on 18/May/12 23:03) but I get the same issue: 2012-05-19 18:49:44,067 [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - ServerSaslResponseCallback(): using empty data[] as server response (length=0) 2012-05-19 18:49:44,067 [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - saslToken (server) length: 0 2012-05-19 18:49:44,067 [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - saslClient.evaluateChallenge(len=0) 2012-05-19 18:49:44,068 [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient - An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: DIGEST-MD5: Digest-challenge format violation: algorithm directive missing) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state. 2012-05-19 18:49:44,068 [SyncThread:0] DEBUG org.apache.zookeeper.server.FinalRequestProcessor - Processing request:: sessionid:0x137666e45f70000 type:sync: cxid:0xa zxid:0xfffffffffffffffe txntype:unknown reqpath:/ 2012-05-19 18:49:44,068 [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient - SASL authentication failed using login context 'null'. Eugene, I'm using as java client Curator. I can provide a junit test to reproduce the issue if it's fine with you. The easiest way to reproduce it is with Curator and the provided TestingServer (curator-test.jar) but I got same using the curator client connected to a running zookeeper server. That's should be the same since the TestingServer is actually running an actual zookeeper server. Let me know and thanks. Antonio
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528189/ZOOKEEPER-1437.patch
          against trunk revision 1337029.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1081//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1081//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1081//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12528189/ZOOKEEPER-1437.patch against trunk revision 1337029. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1081//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1081//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1081//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Restore the interrupt state when catching InterruptedException while waiting for authentication latch (thanks to Zhihong Yu).

          Show
          Eugene Koontz added a comment - Restore the interrupt state when catching InterruptedException while waiting for authentication latch (thanks to Zhihong Yu).
          Hide
          Eugene Koontz added a comment -

          Hi Antonio,

          I think the empty data[] is an artifact of my development process: in my testing, I found that if I use Kerberos as the SASL metchanism, the client's ZooKeeperSaslClient::ServerSaslResponseCallback::processResult() will be supplied with a null data[] array when this method (processResult()) is called for the final time. So I had to handle this situation somehow.

          One could handle the null server response in a different way, as long as you avoid accessing a null pointer, of course - you could check for the null pointer in other places, as your patch does.

          However I don't think this is related to the problem you are seeing; or at least, I'm not able to reproduce your problem myself. I wonder if you could supply a shell script that could reproduce it? Perhaps a script that kills and starts the server repeatedly to cause the client to eventually fail to authenticate?

          -Eugene

          Show
          Eugene Koontz added a comment - Hi Antonio, I think the empty data[] is an artifact of my development process: in my testing, I found that if I use Kerberos as the SASL metchanism, the client's ZooKeeperSaslClient::ServerSaslResponseCallback::processResult() will be supplied with a null data[] array when this method (processResult()) is called for the final time. So I had to handle this situation somehow. One could handle the null server response in a different way, as long as you avoid accessing a null pointer, of course - you could check for the null pointer in other places, as your patch does. However I don't think this is related to the problem you are seeing; or at least, I'm not able to reproduce your problem myself. I wonder if you could supply a shell script that could reproduce it? Perhaps a script that kills and starts the server repeatedly to cause the client to eventually fail to authenticate? -Eugene
          Hide
          Ted Yu added a comment -
          +                } catch (InterruptedException e) {
          +                    // ignore interruption and keep waiting for latch.
          +                }
          

          Should the interrupt state be restored in the above catch block ?

          Show
          Ted Yu added a comment - + } catch (InterruptedException e) { + // ignore interruption and keep waiting for latch. + } Should the interrupt state be restored in the above catch block ?
          Hide
          Antonio added a comment -

          Hi Eugene,

          It's actually happening only on a linux machine. For some reasons doesn't happen on windows. But I got the impression that it's related to how fast the process goes. Perhaps a race condition somewhere?

          (The path is in windows style because I've generated the diff from an eclipse running on windows)

          The sasl setup is the one shown on https://cwiki.apache.org/ZOOKEEPER/zookeeper-and-sasl.html

          Server {
                 org.apache.zookeeper.server.auth.DigestLoginModule required
                 user_super="adminsecret" 
                 user_bob="bobsecret";
          };
          
          Client {
                 org.apache.zookeeper.server.auth.DigestLoginModule required
                 username="bob"
                 password="bobsecret";
          };
          

          The authentication usually works. this issue happens only if I kill the server and restart it again and not all the time.

          What about that "empty data[] as server response"? is it what for?

          Thanks
          Antonio

          Show
          Antonio added a comment - Hi Eugene, It's actually happening only on a linux machine. For some reasons doesn't happen on windows. But I got the impression that it's related to how fast the process goes. Perhaps a race condition somewhere? (The path is in windows style because I've generated the diff from an eclipse running on windows) The sasl setup is the one shown on https://cwiki.apache.org/ZOOKEEPER/zookeeper-and-sasl.html Server { org.apache.zookeeper.server.auth.DigestLoginModule required user_super="adminsecret" user_bob="bobsecret"; }; Client { org.apache.zookeeper.server.auth.DigestLoginModule required username="bob" password="bobsecret"; }; The authentication usually works. this issue happens only if I kill the server and restart it again and not all the time. What about that "empty data[] as server response"? is it what for? Thanks Antonio
          Hide
          Eugene Koontz added a comment -

          Antonio, interesting results - you are finding that sending an empty SASL response to the Zookeeper server causes an authentication failure? Can you show your JAAS configuration for client and server? I see that your paths have backslashes in them - you are running the client on Microsoft Windows I assume? (not that this should be necessarily be a problem).

          Show
          Eugene Koontz added a comment - Antonio, interesting results - you are finding that sending an empty SASL response to the Zookeeper server causes an authentication failure? Can you show your JAAS configuration for client and server? I see that your paths have backslashes in them - you are running the client on Microsoft Windows I assume? (not that this should be necessarily be a problem).
          Hide
          Antonio added a comment -

          Hi All,

          In a scenario where the zookeeper server gets killed and restarted while a java zookeeper client is connected I get the following in the log: (Please note the Error at bottom)

          [main-SendThread(localhost:60728)] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient  - ClientCnxn:sendSaslPacket:length=0
          [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:60728] DEBUG org.apache.zookeeper.server.ZooKeeperServer  - Responding to client SASL token.
          [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:60728] DEBUG org.apache.zookeeper.server.ZooKeeperServer  - Size of client SASL token: 0
          [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:60728] DEBUG org.apache.zookeeper.server.ZooKeeperServer  - Size of server SASL response: 101
          [main-SendThread(localhost:60728)] DEBUG org.apache.zookeeper.ClientCnxn  - Reading reply sessionid:0x1376082ca4f0000, packet:: clientPath:null serverPath:null finished:false header:: 9,102  replyHeader:: 9,0,0  request::   response:: #7265616c6d3d227a6b2d7361736c2d6d6435222c6e6f6e63653d22717a485932306a56743534552f483668315837734e2b4b56656e4c335772666458716f6468336247222c636861727365743d7574662d382c616c676f726974686d3d6d64352d73657373
          2012-05-18 15:14:16,908 [SyncThread:0] DEBUG org.apache.zookeeper.server.FinalRequestProcessor  - Processing request:: sessionid:0x1376082ca4f0000 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
          2012-05-18 15:14:16,908 [SyncThread:0] DEBUG org.apache.zookeeper.server.FinalRequestProcessor  - sessionid:0x1376082ca4f0000 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
          
          [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient  - ServerSaslResponseCallback(): using empty data[] as server response (length=0)
          [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient  - saslToken (server) length: 0
          [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient  - saslClient.evaluateChallenge(len=0)
          [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient  - An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: DIGEST-MD5: Digest-challenge format violation: algorithm directive missing) occurred when evaluating Zookeeper Quorum Member's  received SASL token. Zookeeper Client will go to AUTH_FAILED state.
          [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient  - SASL authentication failed using login context 'null'.
          

          I was wondering why the "empty data[]" is expected since it always drives in my case to a SASL authentication failed.

          I know this is not the correct set up but at moment only one Zk server is set up. Not sure if this is actually causing the issue.

          Just for test I've applied the last patch provided on this ticket with date 16/May/12 18:17
          and then in ClientCnxn.java:565 I've added a null check for rsp.getToken().. and I'm not getting that issue anymore.

          Index: src\java\main\org\apache\zookeeper\ClientCnxn.java
          ===================================================================
          --- src\java\main\org\apache\zookeeper\ClientCnxn.java	(revision 95556)
          +++ src\java\main\org\apache\zookeeper\ClientCnxn.java	(working copy)
          @@ -563,6 +563,7 @@
                                 ZooKeeperSaslClient.ServerSaslResponseCallback cb = (ZooKeeperSaslClient.ServerSaslResponseCallback) p.cb;
                                 SetSASLResponse rsp = (SetSASLResponse) p.response;
                                 // TODO : check rc (== 0, etc) as with other packet types.
          +                      if (null != rsp.getToken()){
                                 cb.processResult(rc,null,p.ctx,rsp.getToken(),null);
                                 ClientCnxn clientCnxn = (ClientCnxn)p.ctx;
                                 if ((clientCnxn == null) || (clientCnxn.zooKeeperSaslClient == null) ||
          @@ -571,6 +572,7 @@
                                     queueEvent(new WatchedEvent(EventType.None,
                                             KeeperState.AuthFailed, null));
                                 }
          +                      }
                             } else if (p.response instanceof GetDataResponse) {
                                 DataCallback cb = (DataCallback) p.cb;
                                 GetDataResponse rsp = (GetDataResponse) p.response;
          
          
          Show
          Antonio added a comment - Hi All, In a scenario where the zookeeper server gets killed and restarted while a java zookeeper client is connected I get the following in the log: (Please note the Error at bottom) [main-SendThread(localhost:60728)] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - ClientCnxn:sendSaslPacket:length=0 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:60728] DEBUG org.apache.zookeeper.server.ZooKeeperServer - Responding to client SASL token. [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:60728] DEBUG org.apache.zookeeper.server.ZooKeeperServer - Size of client SASL token: 0 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:60728] DEBUG org.apache.zookeeper.server.ZooKeeperServer - Size of server SASL response: 101 [main-SendThread(localhost:60728)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply sessionid:0x1376082ca4f0000, packet:: clientPath:null serverPath:null finished:false header:: 9,102 replyHeader:: 9,0,0 request:: response:: #7265616c6d3d227a6b2d7361736c2d6d6435222c6e6f6e63653d22717a485932306a56743534552f483668315837734e2b4b56656e4c335772666458716f6468336247222c636861727365743d7574662d382c616c676f726974686d3d6d64352d73657373 2012-05-18 15:14:16,908 [SyncThread:0] DEBUG org.apache.zookeeper.server.FinalRequestProcessor - Processing request:: sessionid:0x1376082ca4f0000 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a 2012-05-18 15:14:16,908 [SyncThread:0] DEBUG org.apache.zookeeper.server.FinalRequestProcessor - sessionid:0x1376082ca4f0000 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - ServerSaslResponseCallback(): using empty data[] as server response (length=0) [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - saslToken (server) length: 0 [main-EventThread] DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient - saslClient.evaluateChallenge(len=0) [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient - An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: DIGEST-MD5: Digest-challenge format violation: algorithm directive missing) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state. [main-EventThread] ERROR org.apache.zookeeper.client.ZooKeeperSaslClient - SASL authentication failed using login context 'null'. I was wondering why the "empty data[]" is expected since it always drives in my case to a SASL authentication failed. I know this is not the correct set up but at moment only one Zk server is set up. Not sure if this is actually causing the issue. Just for test I've applied the last patch provided on this ticket with date 16/May/12 18:17 and then in ClientCnxn.java:565 I've added a null check for rsp.getToken().. and I'm not getting that issue anymore. Index: src\java\main\org\apache\zookeeper\ClientCnxn.java =================================================================== --- src\java\main\org\apache\zookeeper\ClientCnxn.java (revision 95556) +++ src\java\main\org\apache\zookeeper\ClientCnxn.java (working copy) @@ -563,6 +563,7 @@ ZooKeeperSaslClient.ServerSaslResponseCallback cb = (ZooKeeperSaslClient.ServerSaslResponseCallback) p.cb; SetSASLResponse rsp = (SetSASLResponse) p.response; // TODO : check rc (== 0, etc) as with other packet types. + if (null != rsp.getToken()){ cb.processResult(rc,null,p.ctx,rsp.getToken(),null); ClientCnxn clientCnxn = (ClientCnxn)p.ctx; if ((clientCnxn == null) || (clientCnxn.zooKeeperSaslClient == null) || @@ -571,6 +572,7 @@ queueEvent(new WatchedEvent(EventType.None, KeeperState.AuthFailed, null)); } + } } else if (p.response instanceof GetDataResponse) { DataCallback cb = (DataCallback) p.cb; GetDataResponse rsp = (GetDataResponse) p.response;
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12527672/ZOOKEEPER-1437.patch
          against trunk revision 1337029.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1076//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1076//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1076//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12527672/ZOOKEEPER-1437.patch against trunk revision 1337029. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1076//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1076//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1076//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Excuse me, I meant "ClientCnxn:queuePacket()", not "ClientCnxn:queueSaslPacket()" in my 20:26 comment above.

          Show
          Eugene Koontz added a comment - Excuse me, I meant "ClientCnxn:queuePacket()", not "ClientCnxn:queueSaslPacket()" in my 20:26 comment above.
          Hide
          Eugene Koontz added a comment -

          Use a CountDownLatch within ClientCnxn to control access to outgoing packet queue: non-SASL packets must wait until SASL authentication has completed.

          Show
          Eugene Koontz added a comment - Use a CountDownLatch within ClientCnxn to control access to outgoing packet queue: non-SASL packets must wait until SASL authentication has completed.
          Hide
          Eugene Koontz added a comment -

          Hi Patrick,

          I think I understand you but I want to make sure:

          Your idea is that ClientCnxn:queueSaslPacket() will use a java.util.concurrent.CountDownLatch to determine when SASL authentication is complete. The client's Main thread will await() on this latch. Meanwhile, the client's SendThread will be sending and receiving SASL packets until the ZooKeeperSaslClient has finished authenticating with the server. When SASL authentication is finished (successfully or not), the Send thread will countDown() the latch, and the Main thread will be able to continue, causing its packets to be enqueued on the outgoingQueue.

          NIOServerCnxn::doIO() will not need any changes - it will simply send and receive packets in the outgoingQueue's natural ordering, without regard to their type. But because of the latch in ClientCnxn:queueSaslPacket(), the packets will be ordered so that no permissions-requiring packets are sent before authentication completes.

          Show
          Eugene Koontz added a comment - Hi Patrick, I think I understand you but I want to make sure: Your idea is that ClientCnxn:queueSaslPacket() will use a java.util.concurrent.CountDownLatch to determine when SASL authentication is complete. The client's Main thread will await() on this latch. Meanwhile, the client's SendThread will be sending and receiving SASL packets until the ZooKeeperSaslClient has finished authenticating with the server. When SASL authentication is finished (successfully or not), the Send thread will countDown() the latch, and the Main thread will be able to continue, causing its packets to be enqueued on the outgoingQueue. NIOServerCnxn::doIO() will not need any changes - it will simply send and receive packets in the outgoingQueue's natural ordering, without regard to their type. But because of the latch in ClientCnxn:queueSaslPacket(), the packets will be ordered so that no permissions-requiring packets are sent before authentication completes.
          Hide
          Andrew Purtell added a comment -

          The RegionServers start reliably. Simple test scenarios complete ok,
          nothing seems broken, but I haven't stressed it.

          Show
          Andrew Purtell added a comment - The RegionServers start reliably. Simple test scenarios complete ok, nothing seems broken, but I haven't stressed it.
          Hide
          Patrick Hunt added a comment -

          Thanks Andrew. When you say "it's much better" does that mean everything works, or works better? If the latter what's broken?

          Show
          Patrick Hunt added a comment - Thanks Andrew. When you say "it's much better" does that mean everything works, or works better? If the latter what's broken?
          Hide
          Andrew Purtell added a comment -

          I won't intrude on the design discussion, sounds good.

          Just want to add a datapoint that with HBase 0.92.1 + Hadoop 1.0.2 + ZooKeeper 3.4.3 enough has changed timing wise from the time of the initial SASL auth contribution to make starting HBase RegionServers problematic. When I apply the latest patch on this issue, it's much better.

          Show
          Andrew Purtell added a comment - I won't intrude on the design discussion, sounds good. Just want to add a datapoint that with HBase 0.92.1 + Hadoop 1.0.2 + ZooKeeper 3.4.3 enough has changed timing wise from the time of the initial SASL auth contribution to make starting HBase RegionServers problematic. When I apply the latest patch on this issue, it's much better.
          Hide
          Patrick Hunt added a comment -

          Granted doing what I suggested (remove the "requires auth" check) would require you to implement my suggestion above, specifically

          Perhaps rather than queuing packets at all for sasl perhaps we should generate them dynamically based on the current sendThread state and the saslclient state?

          Show
          Patrick Hunt added a comment - Granted doing what I suggested (remove the "requires auth" check) would require you to implement my suggestion above, specifically Perhaps rather than queuing packets at all for sasl perhaps we should generate them dynamically based on the current sendThread state and the saslclient state?
          Hide
          Patrick Hunt added a comment -

          I don't think we should have the "requires auth" check. We should guarantee order. That seems dangerous to me for a couple reasons. Was there a reason you added it?

          I'd also suggest that you latch the "in progress" indicator so that you don't need to constantly call that method. ie three states "sasl auth pending" in which you still have to check, vs "sasl auth completed" /"no sasl auth" in which case you don't.

          One concern is that we'll hog the cpu in this case. what I mean is. if sasl is in progress, and we just return out of doio, the "write interest" is still set. So the OS select will wake up immediately to allow us to write, which we won't do (still waiting), etc.... Not sure how to address that (or if it's really a problem), need to look at it a bit more. What do you think though?

          Show
          Patrick Hunt added a comment - I don't think we should have the "requires auth" check. We should guarantee order. That seems dangerous to me for a couple reasons. Was there a reason you added it? I'd also suggest that you latch the "in progress" indicator so that you don't need to constantly call that method. ie three states "sasl auth pending" in which you still have to check, vs "sasl auth completed" /"no sasl auth" in which case you don't. One concern is that we'll hog the cpu in this case. what I mean is. if sasl is in progress, and we just return out of doio, the "write interest" is still set. So the OS select will wake up immediately to allow us to write, which we won't do (still waiting), etc.... Not sure how to address that (or if it's really a problem), need to look at it a bit more. What do you think though?
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12525918/ZOOKEEPER-1437.patch
          against trunk revision 1334548.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1065//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1065//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1065//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525918/ZOOKEEPER-1437.patch against trunk revision 1334548. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1065//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1065//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1065//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          -pass simply a boolean flag to doIO() and doTransport() rather than entire Cnxn.SendThread - will be faster too, since clientTunneledAuthenticationInProgress will only be called once per doTransport() call.
          -move operationRequiresPermissions() to ZooDefs.java and make static
          -wrap new LOG.debug() message inside a (LOG.isDebugEnabled()) conditional.

          Show
          Eugene Koontz added a comment - -pass simply a boolean flag to doIO() and doTransport() rather than entire Cnxn.SendThread - will be faster too, since clientTunneledAuthenticationInProgress will only be called once per doTransport() call. -move operationRequiresPermissions() to ZooDefs.java and make static -wrap new LOG.debug() message inside a (LOG.isDebugEnabled()) conditional.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12525802/ZOOKEEPER-1437.patch
          against trunk revision 1334548.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1064//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1064//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1064//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525802/ZOOKEEPER-1437.patch against trunk revision 1334548. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1064//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1064//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1064//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Fix test failure, add whitespace, remove irrelevant changes from trunk.

          Show
          Eugene Koontz added a comment - Fix test failure, add whitespace, remove irrelevant changes from trunk.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12525786/ZOOKEEPER-1437.patch
          against trunk revision 1334548.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1063//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1063//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1063//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525786/ZOOKEEPER-1437.patch against trunk revision 1334548. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1063//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1063//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1063//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          As with patches in March, modify tests to remove now-unnecessary sleeps, and set authInitFailed = true where needed.

          Show
          Eugene Koontz added a comment - As with patches in March, modify tests to remove now-unnecessary sleeps, and set authInitFailed = true where needed.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12525779/ZOOKEEPER-1437.patch
          against trunk revision 1334548.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1062//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1062//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1062//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525779/ZOOKEEPER-1437.patch against trunk revision 1334548. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1062//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1062//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1062//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Modify doIO() so that if SASL authentication is in process, defer sending of packets in the outgoingQueue that are subject to permissions-checking until SASL authentication is complete, per Patrick's recommendation.

          Show
          Eugene Koontz added a comment - Modify doIO() so that if SASL authentication is in process, defer sending of packets in the outgoingQueue that are subject to permissions-checking until SASL authentication is complete, per Patrick's recommendation.
          Hide
          Patrick Hunt added a comment -

          Sorry to come in on this late, but I've really been swamped.

          From what I can see this patch only fixes synchronous operations (those that call submitrequest) and not async (which call queuepacket directly).

          I'm wondering why approach it this way rather than checking if we are in the auth phase in "doIO" and then only sending auth packets until the auth phase has been finalized. (leave any non-auth packets in the queue)

          Perhaps rather than queuing packets at all for sasl perhaps we should generate them dynamically based on the current sendThread state and the saslclient state?

          Show
          Patrick Hunt added a comment - Sorry to come in on this late, but I've really been swamped. From what I can see this patch only fixes synchronous operations (those that call submitrequest) and not async (which call queuepacket directly). I'm wondering why approach it this way rather than checking if we are in the auth phase in "doIO" and then only sending auth packets until the auth phase has been finalized. (leave any non-auth packets in the queue) Perhaps rather than queuing packets at all for sasl perhaps we should generate them dynamically based on the current sendThread state and the saslclient state?
          Hide
          Thomas Weise added a comment -

          Thanks Eugene, the last patch did it! Login synchronization works (NoAuth failures in HBase are gone and no processes hanging). I have not done any negative testing though. Nice to see secure ZK working with HBase!

          I have applied the patch to 3.4.x and adding it as fix version.

          Show
          Thomas Weise added a comment - Thanks Eugene, the last patch did it! Login synchronization works (NoAuth failures in HBase are gone and no processes hanging). I have not done any negative testing though. Nice to see secure ZK working with HBase! I have applied the patch to 3.4.x and adding it as fix version.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520699/ZOOKEEPER-1437.patch
          against trunk revision 1307644.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 19 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1024//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1024//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1024//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520699/ZOOKEEPER-1437.patch against trunk revision 1307644. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 19 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1024//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1024//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1024//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Thanks Thomas for your testing. Can you please try the latest patch and see if it the client doesn't hang anymore?
          -Eugene

          Show
          Eugene Koontz added a comment - Thanks Thomas for your testing. Can you please try the latest patch and see if it the client doesn't hang anymore? -Eugene
          Hide
          Eugene Koontz added a comment -

          Fixes two bugs in the last patch:

          -authSync synchronization object was being created twice

          -authResultNotify() was not being called in one SASL failure mode

          Show
          Eugene Koontz added a comment - Fixes two bugs in the last patch: -authSync synchronization object was being created twice -authResultNotify() was not being called in one SASL failure mode
          Hide
          Thomas Weise added a comment -

          I see this consistently happening in the hbase client also:

          "main" prio=10 tid=0x08061000 nid=0x5af6 in Object.wait() [0xf741c000]
             java.lang.Thread.State: WAITING (on object monitor)
                  at java.lang.Object.wait(Native Method)
                  - waiting on <0xf2674320> (a java.lang.Object)
                  at java.lang.Object.wait(Object.java:485)
                  at org.apache.zookeeper.ClientCnxn.waitForAuthenticationIfNecessary(ClientCnxn.java:1344)
                  - locked <0xf2674320> (a java.lang.Object)
                  at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1355)
                  at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1031)
                  at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
                  at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226)
                  at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76)
                  - locked <0xf2675948> (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker)
                  at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:233)
                  at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:143)
                  - locked <0xf3f76370> (a org.apache.hadoop.hbase.client.HBaseAdmin)
                  at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:200)
          

          Authentication succeeds on the server but the client waits forever.

          Show
          Thomas Weise added a comment - I see this consistently happening in the hbase client also: "main" prio=10 tid=0x08061000 nid=0x5af6 in Object .wait() [0xf741c000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0xf2674320> (a java.lang. Object ) at java.lang. Object .wait( Object .java:485) at org.apache.zookeeper.ClientCnxn.waitForAuthenticationIfNecessary(ClientCnxn.java:1344) - locked <0xf2674320> (a java.lang. Object ) at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1355) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1031) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154) at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76) - locked <0xf2675948> (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker) at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:233) at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:143) - locked <0xf3f76370> (a org.apache.hadoop.hbase.client.HBaseAdmin) at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:200) Authentication succeeds on the server but the client waits forever.
          Hide
          Thomas Weise added a comment -

          With the patch HBase region server will hang (similar problem in HBase shell, updated only the zookeeper client, no change to server):

          "regionserver60020" prio=10 tid=0x0846ac00 nid=0x6032 in Object.wait() [0xaf2ad000]
             java.lang.Thread.State: WAITING (on object monitor)
                  at java.lang.Object.wait(Native Method)
                  - waiting on <0xdf5c1e30> (a java.lang.Object)
                  at java.lang.Object.wait(Object.java:485)
                  at org.apache.zookeeper.ClientCnxn.waitForAuthenticationIfNecessary(ClientCnxn.java:1344)
                  - locked <0xdf5c1e30> (a java.lang.Object)
                  at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1355)
                  at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1031)
                  at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
                  at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226)
                  at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76)
                  - locked <0xdf5c38e8> (a org.apache.hadoop.hbase.MasterAddressTracker)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580)
                  - locked <0xdf5b8120> (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation)
                  at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:569)
                  at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:186)
                  - locked <0xe064fa10> (a org.apache.hadoop.hbase.client.HConnectionManager$1)
                  at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:178)
                  at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:565)
                  at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:524)
                  at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:625)
                  at java.lang.Thread.run(Thread.java:619)
          
          Show
          Thomas Weise added a comment - With the patch HBase region server will hang (similar problem in HBase shell, updated only the zookeeper client, no change to server): "regionserver60020" prio=10 tid=0x0846ac00 nid=0x6032 in Object .wait() [0xaf2ad000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <0xdf5c1e30> (a java.lang. Object ) at java.lang. Object .wait( Object .java:485) at org.apache.zookeeper.ClientCnxn.waitForAuthenticationIfNecessary(ClientCnxn.java:1344) - locked <0xdf5c1e30> (a java.lang. Object ) at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1355) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1031) at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154) at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:226) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:76) - locked <0xdf5c38e8> (a org.apache.hadoop.hbase.MasterAddressTracker) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:580) - locked <0xdf5b8120> (a org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:569) at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:186) - locked <0xe064fa10> (a org.apache.hadoop.hbase.client.HConnectionManager$1) at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:178) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:565) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:524) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:625) at java.lang. Thread .run( Thread .java:619)
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520671/ZOOKEEPER-1437.patch
          against trunk revision 1307191.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 19 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1023//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1023//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1023//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520671/ZOOKEEPER-1437.patch against trunk revision 1307191. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 19 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1023//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1023//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1023//console This message is automatically generated.
          Hide
          Eugene Koontz added a comment -

          Fixes the two Findbugs warnings reported by last patch.

          Show
          Eugene Koontz added a comment - Fixes the two Findbugs warnings reported by last patch.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12520635/ZOOKEEPER-1437.patch
          against trunk revision 1307191.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 19 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1022//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1022//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1022//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520635/ZOOKEEPER-1437.patch against trunk revision 1307191. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 19 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1022//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1022//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1022//console This message is automatically generated.
          Hide
          Thomas Weise added a comment -

          Hi Eugene,

          Thanks for looking into it, I will test the patch later today. Can we get this fixed for 3.4.x?

          Show
          Thomas Weise added a comment - Hi Eugene, Thanks for looking into it, I will test the patch later today. Can we get this fixed for 3.4.x?
          Hide
          Eugene Koontz added a comment -

          Passes test-core-java.

          Show
          Eugene Koontz added a comment - Passes test-core-java.
          Hide
          Eugene Koontz added a comment -

          This patch adds an authSync object to ClientCnxn. If client is configured for SASL authentication, client's main thread will wait() on authSync. SendThread will authSync() when SASL negotiation with Zookeeper server is complete (whether successfully or not).

          I removed the sleep()s from SaslAuthTest and other Sasl tests which were hiding this issue : now the clients immediately attempt to create zk nodes, as with to Thomas's test case of running zkCli with an immediate get.

          However, because of removing the sleep()s that hid the issue, the test SaslAuthMissingClientConfigTest hangs. This is because the zooKeeperSaslClient is null because SASL configuration is intentionally bad and the SASL setup has therefore failed. This failure mode needs to be distinguished from the case where zooKeeperSaslClient is null because it hasn't been constructed yet. This remains to be done, so not marking this issue as "Patch Available" yet.

          Show
          Eugene Koontz added a comment - This patch adds an authSync object to ClientCnxn. If client is configured for SASL authentication, client's main thread will wait() on authSync. SendThread will authSync() when SASL negotiation with Zookeeper server is complete (whether successfully or not). I removed the sleep()s from SaslAuthTest and other Sasl tests which were hiding this issue : now the clients immediately attempt to create zk nodes, as with to Thomas's test case of running zkCli with an immediate get. However, because of removing the sleep()s that hid the issue, the test SaslAuthMissingClientConfigTest hangs. This is because the zooKeeperSaslClient is null because SASL configuration is intentionally bad and the SASL setup has therefore failed. This failure mode needs to be distinguished from the case where zooKeeperSaslClient is null because it hasn't been constructed yet. This remains to be done, so not marking this issue as "Patch Available" yet.
          Hide
          Eugene Koontz added a comment -

          Hi Thomas,

          Thanks for diagnosing this. This problem is occuring because the client's 'Main' thread (Main of zkCli.sh) is trying to access resources (e.g. with getData) before the sendThread has finished SASL authentication with the Zookeeper server. Thus the fix is to have the client's Main thread wait in submitRequest() for a notification from the SendThread that authentication has completed.

          I am working on a patch now.

          Show
          Eugene Koontz added a comment - Hi Thomas, Thanks for diagnosing this. This problem is occuring because the client's 'Main' thread (Main of zkCli.sh) is trying to access resources (e.g. with getData) before the sendThread has finished SASL authentication with the Zookeeper server. Thus the fix is to have the client's Main thread wait in submitRequest() for a notification from the SendThread that authentication has completed. I am working on a patch now.
          Hide
          Thomas Weise added a comment -

          PASSES: Interactive getData on ACL protected node:

          ./zkCli.sh -server gsbl90247

          [zk: gsbl90247(CONNECTED) 1] ls /hbase
          [splitlog, unassigned, root-region-server, rs, draining, table, master, tokenauth, shutdown, hbaseid]

          [zk: gsbl90247(CONNECTED) 2] getAcl /hbase
          'sasl,'hbase
          : cdrwa

          This works because by the time the ls /hbase is executed, SASL authentication is complete (WatchedEvent state:SaslAuthenticated)

          FAILS: When running the command immediately:

          ./zkCli.sh -server gsbl90247 -e "ls /hbase"

          WATCHER::

          WatchedEvent state:SyncConnected type:None path:null
          Exception in thread "main" org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
          at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
          at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1448)
          at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1476)
          at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:717)
          at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593)
          at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:354)
          at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)

          WATCHER::

          WatchedEvent state:SaslAuthenticated type:None path:null

          getData occurs prior to SaslAuthenticated.

          Show
          Thomas Weise added a comment - PASSES: Interactive getData on ACL protected node: ./zkCli.sh -server gsbl90247 [zk: gsbl90247(CONNECTED) 1] ls /hbase [splitlog, unassigned, root-region-server, rs, draining, table, master, tokenauth, shutdown, hbaseid] [zk: gsbl90247(CONNECTED) 2] getAcl /hbase 'sasl,'hbase : cdrwa This works because by the time the ls /hbase is executed, SASL authentication is complete (WatchedEvent state:SaslAuthenticated) FAILS: When running the command immediately: ./zkCli.sh -server gsbl90247 -e "ls /hbase" WATCHER:: WatchedEvent state:SyncConnected type:None path:null Exception in thread "main" org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase at org.apache.zookeeper.KeeperException.create(KeeperException.java:113) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1448) at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1476) at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:717) at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:593) at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:354) at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282) WATCHER:: WatchedEvent state:SaslAuthenticated type:None path:null getData occurs prior to SaslAuthenticated.

            People

            • Assignee:
              Eugene Koontz
              Reporter:
              Thomas Weise
            • Votes:
              0 Vote for this issue
              Watchers:
              19 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development