ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-1105

c client zookeeper_close not send CLOSE_OP request to server

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 3.3.2, 3.4.3
    • Fix Version/s: 3.5.0
    • Component/s: c client
    • Labels:
      None

      Description

      in zookeeper_close function, do adaptor_finish before send CLOSE_OP request to server
      so the CLOSE_OP request can not be sent to server

      in server zookeeper.log have many
      2011-06-22 00:23:02,323 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] - EndOfStreamException: Unable to read additional data from client sessionid 0x1305970d66d2224, likely client has closed socket
      2011-06-22 00:23:02,324 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /10.250.8.123:60257 which had sessionid 0x1305970d66d2224
      2011-06-22 00:23:02,325 - ERROR [CommitProcessor:1:NIOServerCnxn@445] - Unexpected Exception:
      java.nio.channels.CancelledKeyException
      at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
      at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
      at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:418)
      at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1509)
      at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
      at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)

      and java client not have this problem

      1. zktest.java
        2 kB
        Diego Ongaro
      2. zktest.c
        2 kB
        Diego Ongaro
      3. zklog.txt
        13 kB
        Diego Ongaro
      4. ZOOKEEPER-1105.patch
        1 kB
        lincoln.lee
      5. ZOOKEEPER-1105v1.patch
        1 kB
        lincoln.lee

        Activity

        Hide
        Michi Mutsuzaki added a comment -

        The C client doesn't compile with this patch anymore because of the logging macro change.

        Show
        Michi Mutsuzaki added a comment - The C client doesn't compile with this patch anymore because of the logging macro change.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12546985/ZOOKEEPER-1105v1.patch
        against trunk revision 1530166.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546985/ZOOKEEPER-1105v1.patch against trunk revision 1530166. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//console This message is automatically generated.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12546985/ZOOKEEPER-1105v1.patch
        against trunk revision 1389711.

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546985/ZOOKEEPER-1105v1.patch against trunk revision 1389711. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//console This message is automatically generated.
        Hide
        lincoln.lee added a comment -

        use select() for windows

        Show
        lincoln.lee added a comment - use select() for windows
        Hide
        lincoln.lee added a comment -

        check "#ifndef WIN32" use select() for windows

        Show
        lincoln.lee added a comment - check "#ifndef WIN32" use select() for windows
        Hide
        Hudson added a comment -

        Integrated in ZooKeeper-trunk #1682 (See https://builds.apache.org/job/ZooKeeper-trunk/1682/)
        Revert 1380931: ZOOKEEPER-1105 c client zookeeper_close not send CLOSE_OP
        request to server (lincoln.lee via michim) (Revision 1385378)

        Result = SUCCESS
        michim : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1385378
        Files :

        • /zookeeper/trunk/CHANGES.txt
        • /zookeeper/trunk/src/c/src/zookeeper.c
        Show
        Hudson added a comment - Integrated in ZooKeeper-trunk #1682 (See https://builds.apache.org/job/ZooKeeper-trunk/1682/ ) Revert 1380931: ZOOKEEPER-1105 c client zookeeper_close not send CLOSE_OP request to server (lincoln.lee via michim) (Revision 1385378) Result = SUCCESS michim : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1385378 Files : /zookeeper/trunk/CHANGES.txt /zookeeper/trunk/src/c/src/zookeeper.c
        Hide
        Mahadev konar added a comment -

        Thanks Michi!

        Show
        Mahadev konar added a comment - Thanks Michi!
        Hide
        Michi Mutsuzaki added a comment -

        I just reverted the change in both trunk and 3.4 branch. I'll move the target version to 3.5.

        Thanks!
        --Michi

        Show
        Michi Mutsuzaki added a comment - I just reverted the change in both trunk and 3.4 branch. I'll move the target version to 3.5. Thanks! --Michi
        Hide
        Mahadev konar added a comment -

        Michi looks like you reverted the patch for trunk. Can you do that for 3.4 branch as well? If not let me know. I can do it.

        Show
        Mahadev konar added a comment - Michi looks like you reverted the patch for trunk. Can you do that for 3.4 branch as well? If not let me know. I can do it.
        Hide
        Mahadev konar added a comment -

        Nice catch Michi. I think Ill revert the patch for 3.4 and trunk and we can fix it later. I dont think this looks like a blocker for 3.4 release. Mitchi what do you think?

        Show
        Mahadev konar added a comment - Nice catch Michi. I think Ill revert the patch for 3.4 and trunk and we can fix it later. I dont think this looks like a blocker for 3.4 release. Mitchi what do you think?
        Hide
        Michi Mutsuzaki added a comment -

        It looks like this patch broke the windows build. The pollfd struct is not available in windows, so we need to check "#ifdef WIN32" and do something else for windows.

        https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008/513/console

        Show
        Michi Mutsuzaki added a comment - It looks like this patch broke the windows build. The pollfd struct is not available in windows, so we need to check "#ifdef WIN32" and do something else for windows. https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008/513/console
        Hide
        Hudson added a comment -

        Integrated in ZooKeeper-trunk #1670 (See https://builds.apache.org/job/ZooKeeper-trunk/1670/)
        ZOOKEEPER-1105 c client zookeeper_close not send CLOSE_OP request to server (lincoln.lee via michim) (Revision 1380931)

        Result = SUCCESS
        michim : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1380931
        Files :

        • /zookeeper/trunk/CHANGES.txt
        • /zookeeper/trunk/src/c/src/zookeeper.c
        Show
        Hudson added a comment - Integrated in ZooKeeper-trunk #1670 (See https://builds.apache.org/job/ZooKeeper-trunk/1670/ ) ZOOKEEPER-1105 c client zookeeper_close not send CLOSE_OP request to server (lincoln.lee via michim) (Revision 1380931) Result = SUCCESS michim : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1380931 Files : /zookeeper/trunk/CHANGES.txt /zookeeper/trunk/src/c/src/zookeeper.c
        Hide
        Michi Mutsuzaki added a comment -

        +1.

        I'll check this in to trunk and the 3.4 branch.

        Thanks!
        --Michi

        Show
        Michi Mutsuzaki added a comment - +1. I'll check this in to trunk and the 3.4 branch. Thanks! --Michi
        Hide
        lincoln.lee added a comment -

        patch for avoid WARN & ERROR logs on zookeeper server when client calls close method

        Show
        lincoln.lee added a comment - patch for avoid WARN & ERROR logs on zookeeper server when client calls close method
        Hide
        lincoln.lee added a comment -

        Hi Patrick, the patch is ready for review, sorry for response so late.

        Show
        lincoln.lee added a comment - Hi Patrick, the patch is ready for review, sorry for response so late.
        Hide
        Patrick Hunt added a comment -

        Hi Lincoln, if you think the patch is ready for review please "submit patch" above, thanks.

        Show
        Patrick Hunt added a comment - Hi Lincoln, if you think the patch is ready for review please "submit patch" above, thanks.
        Hide
        lincoln.lee added a comment -

        fix spelling of comments

        Show
        lincoln.lee added a comment - fix spelling of comments
        Hide
        lincoln.lee added a comment -

        Diego, I've also encountered java.nio.channels.CancelledKeyException in server's log when a python client calls close method, but I found client had sent CLOSE_OP request to server but exit before server can read the request completely,

        src/c/src/zookeeper.c
        @@ -2522,6 +2522,22 @@
        /* make sure the close request is sent; we set timeout to an arbitrary

        • (but reasonable) number of milliseconds since we want the call to block*/
          rc=adaptor_send_queue(zh, 3000); // here actually call send(s, buf, len, MSG_NOSIGNAL)

        we can simply sleep one second after call send_buffer in flush_send_queue method of src/c/src/zookeeper.c
        rc = send_buffer(zh->fd, zh->to_send.head);
        + if (timeout>0)sleep(1);
        if(rc==0 && timeout==0){

        or add a little more lines as the ZOOKEEPER-1105.patch attached (against trunk)

        Show
        lincoln.lee added a comment - Diego, I've also encountered java.nio.channels.CancelledKeyException in server's log when a python client calls close method, but I found client had sent CLOSE_OP request to server but exit before server can read the request completely, src/c/src/zookeeper.c @@ -2522,6 +2522,22 @@ /* make sure the close request is sent; we set timeout to an arbitrary (but reasonable) number of milliseconds since we want the call to block*/ rc=adaptor_send_queue(zh, 3000); // here actually call send(s, buf, len, MSG_NOSIGNAL) we can simply sleep one second after call send_buffer in flush_send_queue method of src/c/src/zookeeper.c rc = send_buffer(zh->fd, zh->to_send.head); + if (timeout>0)sleep(1); if(rc==0 && timeout==0){ or add a little more lines as the ZOOKEEPER-1105 .patch attached (against trunk)
        Hide
        lincoln.lee added a comment -

        resolution for avoid WARN & ERROR logs on zookeeper server when client calls close method

        Show
        lincoln.lee added a comment - resolution for avoid WARN & ERROR logs on zookeeper server when client calls close method
        Hide
        Diego Ongaro added a comment -

        Patrick, I don't have a patch. My attachments help explain the issue but do not fix it.

        Show
        Diego Ongaro added a comment - Patrick, I don't have a patch. My attachments help explain the issue but do not fix it.
        Hide
        Patrick Hunt added a comment -

        Diego, can you submit this in the form of a patch?
        https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute

        I'm afraid we missed this originally, when you have something you'd like us to review please advance the workflow by clicking on the "submit" link. Thanks!

        Show
        Patrick Hunt added a comment - Diego, can you submit this in the form of a patch? https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute I'm afraid we missed this originally, when you have something you'd like us to review please advance the workflow by clicking on the "submit" link. Thanks!
        Hide
        Diego Ongaro added a comment -

        I think I've encountered a client-visible manifestation of this issue.

        The ZooKeeper FAQ entry [1] titled "Is there an easy way to expire a session for testing?" explains that when one client closes a session, all other sessions using that same session ID will be expired. This feature appears to work with Java clients but not with C clients.

        I've attached a C and a Java version of the test case. The basic idea is this:

        1) Connect to ZooKeeper.
        2) Create a test znode ("/test") if it doesn't exist.
        3) Write to the znode.
        4) Connect to ZooKeeper with the first connection's session ID and password.
        5) Close this new connection.
        6) Try to write to the znode again on the first connection.

        The write in (6) should never succeed. Using the Java client, the write first returns a ConnectionLossException. Retrying it then returns a SessionExpiredException, as expected. However, when using the C client, the write returns ZOK; this is the wrong behavior.

        I've also attached the relevant parts of the server's log file when running against both the Java client and the C client. They diverge in the way jiang guangran originally reported. The key thing to see in these logs is that, in the C client, the session is not terminated cleanly. The client library transparently reconnects a new socket using the same session ID, and the server accepts the session ID that should have been expired.

        Based on this behavior, I think there may be two issues with the C client library:

        • First, it does not cleanly close its sessions. This prevents C clients from forcefully closing each others' sessions.
        • Second, it attempts to reconnect over a new socket without returning a ZCONNECTIONLOSS to the client. This doesn't matter much to me, but could it affect correctness for some applications?

        If you need more info or clarification, feel free to ask by commenting on this issue. I'm also on IRC as 'ongardie'.

        Cheers,
        Diego

        [1] http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A4

        Show
        Diego Ongaro added a comment - I think I've encountered a client-visible manifestation of this issue. The ZooKeeper FAQ entry [1] titled "Is there an easy way to expire a session for testing?" explains that when one client closes a session, all other sessions using that same session ID will be expired. This feature appears to work with Java clients but not with C clients. I've attached a C and a Java version of the test case. The basic idea is this: 1) Connect to ZooKeeper. 2) Create a test znode ("/test") if it doesn't exist. 3) Write to the znode. 4) Connect to ZooKeeper with the first connection's session ID and password. 5) Close this new connection. 6) Try to write to the znode again on the first connection. The write in (6) should never succeed. Using the Java client, the write first returns a ConnectionLossException. Retrying it then returns a SessionExpiredException, as expected. However, when using the C client, the write returns ZOK; this is the wrong behavior. I've also attached the relevant parts of the server's log file when running against both the Java client and the C client. They diverge in the way jiang guangran originally reported. The key thing to see in these logs is that, in the C client, the session is not terminated cleanly. The client library transparently reconnects a new socket using the same session ID, and the server accepts the session ID that should have been expired. Based on this behavior, I think there may be two issues with the C client library: First, it does not cleanly close its sessions. This prevents C clients from forcefully closing each others' sessions. Second, it attempts to reconnect over a new socket without returning a ZCONNECTIONLOSS to the client. This doesn't matter much to me, but could it affect correctness for some applications? If you need more info or clarification, feel free to ask by commenting on this issue. I'm also on IRC as 'ongardie'. Cheers, Diego [1] http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A4
        Hide
        Diego Ongaro added a comment -

        Server log file containing snippets from running both the Java client and the C client.

        Show
        Diego Ongaro added a comment - Server log file containing snippets from running both the Java client and the C client.
        Hide
        Diego Ongaro added a comment -

        Test program in C (showing incorrect behavior).

        Show
        Diego Ongaro added a comment - Test program in C (showing incorrect behavior).
        Hide
        Diego Ongaro added a comment -

        Test program in Java (showing correct behavior).

        Show
        Diego Ongaro added a comment - Test program in Java (showing correct behavior).

          People

          • Assignee:
            lincoln.lee
            Reporter:
            jiang guangran
          • Votes:
            2 Vote for this issue
            Watchers:
            11 Start watching this issue

            Dates

            • Created:
              Updated:

              Development