ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-1560

Zookeeper client hangs on creation of large nodes

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.4, 3.5.0
    • Fix Version/s: 3.4.5, 3.5.0
    • Component/s: java client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      To reproduce, try creating a node with 0.5M of data using java client. The test will hang waiting for a response from the server. See the attached patch for the test that reproduces the issue.

      It seems that ZOOKEEPER-1437 introduced a few issues to ClientCnxnSocketNIO.doIO that prevent ClientCnxnSocketNIO from sending large packets that require several invocations of SocketChannel.write to complete. The first issue is that the call to outgoingQueue.removeFirstOccurrence(p); removes the packet from the queue even if the packet wasn't completely sent yet. It looks to me that this call should be moved under if (!pbb.hasRemaining()) The second issue is that p.createBB() is reinitializing ByteBuffer on every iteration, which confuses SocketChannel.write. And the third issue is caused by extra calls to cnxn.getXid() that increment xid on every iteration and confuse the server.

      1. ZOOKEEPER-1560-v8.patch
        6 kB
        Skye Wanderman-Milne
      2. ZOOKEEPER-1560-v8_r4.patch
        8 kB
        Skye Wanderman-Milne
      3. ZOOKEEPER-1560-v8_r3.patch
        7 kB
        Skye Wanderman-Milne
      4. zookeeper-1560-v7.txt
        5 kB
        Ted Yu
      5. zookeeper-1560-v6.txt
        5 kB
        Ted Yu
      6. zookeeper-1560-v5.txt
        4 kB
        Ted Yu
      7. zookeeper-1560-v4.txt
        4 kB
        Ted Yu
      8. zookeeper-1560-v3.txt
        3 kB
        Ted Yu
      9. zookeeper-1560-v2.txt
        3 kB
        Ted Yu
      10. zookeeper-1560-v1.txt
        2 kB
        Ted Yu
      11. ZOOKEEPER-1560.patch
        0.9 kB
        Igor Motov

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in ZooKeeper-trunk #1728 (See https://builds.apache.org/job/ZooKeeper-trunk/1728/)
          ZOOKEEPER-1560 Zookeeper client hangs on creation of large nodes. Added missing test. (Skye Wanderman-Milne via phunt) (Revision 1404288)
          ZOOKEEPER-1560 Zookeeper client hangs on creation of large nodes (Skye Wanderman-Milne via phunt) (Revision 1404260)

          Result = SUCCESS
          phunt : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404288
          Files :

          • /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ClientTest.java

          phunt : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404260
          Files :

          • /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          Show
          Hudson added a comment - Integrated in ZooKeeper-trunk #1728 (See https://builds.apache.org/job/ZooKeeper-trunk/1728/ ) ZOOKEEPER-1560 Zookeeper client hangs on creation of large nodes. Added missing test. (Skye Wanderman-Milne via phunt) (Revision 1404288) ZOOKEEPER-1560 Zookeeper client hangs on creation of large nodes (Skye Wanderman-Milne via phunt) (Revision 1404260) Result = SUCCESS phunt : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404288 Files : /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ClientTest.java phunt : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1404260 Files : /zookeeper/trunk/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          Hide
          Patrick Hunt added a comment -

          Committed to branch-3.4 and trunk. Thanks Skye et. al. !

          Show
          Patrick Hunt added a comment - Committed to branch-3.4 and trunk. Thanks Skye et. al. !
          Hide
          Patrick Hunt added a comment -

          Things take time.

          I was waiting for Skye to update the patch to include the missing test. It's failing to apply now because the most recent patch is now partially committed to trunk. Working on it.

          Show
          Patrick Hunt added a comment - Things take time. I was waiting for Skye to update the patch to include the missing test. It's failing to apply now because the most recent patch is now partially committed to trunk. Working on it.
          Hide
          Ted Yu added a comment -

          I think Patrick forgot to announce the integration of your prior patch:

          r1404260 | phunt | 2012-10-31 10:24:54 -0700 (Wed, 31 Oct 2012) | 1 line
          
          ZOOKEEPER-1560 Zookeeper client hangs on creation of large nodes (Skye Wanderman-Milne via phunt)
          
          Show
          Ted Yu added a comment - I think Patrick forgot to announce the integration of your prior patch: r1404260 | phunt | 2012-10-31 10:24:54 -0700 (Wed, 31 Oct 2012) | 1 line ZOOKEEPER-1560 Zookeeper client hangs on creation of large nodes (Skye Wanderman-Milne via phunt)
          Hide
          Skye Wanderman-Milne added a comment -

          Can anyone figure out why QA couldn't apply the patch?

          Show
          Skye Wanderman-Milne added a comment - Can anyone figure out why QA couldn't apply the patch?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12551585/ZOOKEEPER-1560-v8_r4.patch
          against trunk revision 1404260.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1242//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551585/ZOOKEEPER-1560-v8_r4.patch against trunk revision 1404260. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1242//console This message is automatically generated.
          Hide
          Skye Wanderman-Milne added a comment -

          Add ClientTest.testLargeNodeData to v8_r3 patch.

          Show
          Skye Wanderman-Milne added a comment - Add ClientTest.testLargeNodeData to v8_r3 patch.
          Hide
          Patrick Hunt added a comment -

          Looks good to me. +1 on the patch.

          Show
          Patrick Hunt added a comment - Looks good to me. +1 on the patch.
          Hide
          Ted Yu added a comment -

          Above explanation makes sense.

          Show
          Ted Yu added a comment - Above explanation makes sense.
          Hide
          Skye Wanderman-Milne added a comment -

          Rather than checking if the xid is 0, the new patch moves the xid code into the if (p.bb == null) block. We're essentially checking whether we've already started writing the packet, and if we haven't, we call createBB and setXid. Since we always do these two actions under the same condition (i.e., when we see a new packet), there's no reason to have two separate checks.

          Show
          Skye Wanderman-Milne added a comment - Rather than checking if the xid is 0, the new patch moves the xid code into the if (p.bb == null) block. We're essentially checking whether we've already started writing the packet, and if we haven't, we call createBB and setXid. Since we always do these two actions under the same condition (i.e., when we see a new packet), there's no reason to have two separate checks.
          Hide
          Ted Yu added a comment -

          Skye provided the latest patch. Assigning to Skye.

          Show
          Ted Yu added a comment - Skye provided the latest patch. Assigning to Skye.
          Hide
          Ted Yu added a comment -

          It was about 'the third issue is caused by extra calls to cnxn.getXid() that increment xid on every iteration and confuse the server.'
          In previous patch, there was check whether Xid is 0 before calling p.requestHeader.setXid().

          Show
          Ted Yu added a comment - It was about 'the third issue is caused by extra calls to cnxn.getXid() that increment xid on every iteration and confuse the server.' In previous patch, there was check whether Xid is 0 before calling p.requestHeader.setXid().
          Hide
          Patrick Hunt added a comment -

          In patch v8, I don't see special handling for cnxn.getXid()

          Hi Ted, I'm not clear, what do you mean by that comment?

          Show
          Patrick Hunt added a comment - In patch v8, I don't see special handling for cnxn.getXid() Hi Ted, I'm not clear, what do you mean by that comment?
          Hide
          Ted Yu added a comment -

          In patch v8, I don't see special handling for cnxn.getXid()
          FYI

          Show
          Ted Yu added a comment - In patch v8, I don't see special handling for cnxn.getXid() FYI
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12551039/ZOOKEEPER-1560-v8_r3.patch
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1241//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1241//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1241//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551039/ZOOKEEPER-1560-v8_r3.patch against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1241//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1241//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1241//console This message is automatically generated.
          Hide
          Skye Wanderman-Milne added a comment -

          syncing with review board

          Show
          Skye Wanderman-Milne added a comment - syncing with review board
          Hide
          Ted Yu added a comment -

          I left some minor comments on review board.
          Nice work, Skye.

          Show
          Ted Yu added a comment - I left some minor comments on review board. Nice work, Skye.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12550725/ZOOKEEPER-1560-v8.patch
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550725/ZOOKEEPER-1560-v8.patch against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1238//console This message is automatically generated.
          Hide
          Skye Wanderman-Milne added a comment -

          I've created a new patch (ZOOKEEPER-1560-v8.patch) that incorporates what we have so far (moving removeFirstOccurrence to after the packet is completely written, only calling createBB when a BB doesn't already exist, and only calling setXid when no xid is already set). It also modifies findSendablePacket to always choose the first packet if it is partially written. The only place that a packet is prepended to outgoingQueue is ClientCnxn.primeConnection, which should only happen at the very beginning, so a partially-written packet should remain at the beginning of the queue until it is removed. I also cleaned up some of the code so the changes look more extensive than they really are Posted at https://reviews.apache.org/r/7730. I added comments to mark the important parts (as opposed to the clean up).

          Show
          Skye Wanderman-Milne added a comment - I've created a new patch ( ZOOKEEPER-1560 -v8.patch) that incorporates what we have so far (moving removeFirstOccurrence to after the packet is completely written, only calling createBB when a BB doesn't already exist, and only calling setXid when no xid is already set). It also modifies findSendablePacket to always choose the first packet if it is partially written. The only place that a packet is prepended to outgoingQueue is ClientCnxn.primeConnection, which should only happen at the very beginning, so a partially-written packet should remain at the beginning of the queue until it is removed. I also cleaned up some of the code so the changes look more extensive than they really are Posted at https://reviews.apache.org/r/7730 . I added comments to mark the important parts (as opposed to the clean up).
          Hide
          Igor Motov added a comment -

          For problem #3, I only found one call to getXid() in doIO:

          p.requestHeader.setXid(cnxn.getXid());
          

          which is not in a loop. Some clarification would be nice.

          It's in the outer loop, so to speak. If the packet is large and is sent in chunks, Xid is incremented for every chunk. Before ZOOKEEPER-1437 it was incremented for every packet.

          Show
          Igor Motov added a comment - For problem #3, I only found one call to getXid() in doIO: p.requestHeader.setXid(cnxn.getXid()); which is not in a loop. Some clarification would be nice. It's in the outer loop, so to speak. If the packet is large and is sent in chunks, Xid is incremented for every chunk. Before ZOOKEEPER-1437 it was incremented for every packet.
          Hide
          Ted Yu added a comment -

          I got the following based on the above code snippet:

          Index: src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          ===================================================================
          --- src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java	(revision 1401904)
          +++ src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java	(working copy)
          @@ -111,18 +111,18 @@
                                   cnxn.sendThread.clientTunneledAuthenticationInProgress());
           
                           if (p != null) {
          -                    outgoingQueue.removeFirstOccurrence(p);
                               updateLastSend();
                               if ((p.requestHeader != null) &&
                                       (p.requestHeader.getType() != OpCode.ping) &&
                                       (p.requestHeader.getType() != OpCode.auth)) {
                                   p.requestHeader.setXid(cnxn.getXid());
                               }
          -                    p.createBB();
          +                    if (p.bb == null) p.createBB();
                               ByteBuffer pbb = p.bb;
                               sock.write(pbb);
                               if (!pbb.hasRemaining()) {
                                   sentCount++;
          +                        outgoingQueue.removeFirstOccurrence(p);
                                   if (p.requestHeader != null
                                           && p.requestHeader.getType() != OpCode.ping
                                           && p.requestHeader.getType() != OpCode.auth) {
          @@ -141,8 +141,12 @@
                       synchronized(pendingQueue) {
                           pendingQueue.addAll(pending);
                       }
          -
                   }
          +        if (outgoingQueue.isEmpty()) {
          +          disableWrite();
          +        } else {
          +            enableWrite();
          +        }
               }
           
               private Packet findSendablePacket(LinkedList<Packet> outgoingQueue,
          

          I still saw testLargeNodeData fail:

          Testcase: testLargeNodeData took 0.714 sec
            Caused an ERROR
          KeeperErrorCode = ConnectionLoss for /large
          org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large
            at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
            at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
            at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
            at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
          
          Show
          Ted Yu added a comment - I got the following based on the above code snippet: Index: src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java =================================================================== --- src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (revision 1401904) +++ src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (working copy) @@ -111,18 +111,18 @@ cnxn.sendThread.clientTunneledAuthenticationInProgress()); if (p != null ) { - outgoingQueue.removeFirstOccurrence(p); updateLastSend(); if ((p.requestHeader != null ) && (p.requestHeader.getType() != OpCode.ping) && (p.requestHeader.getType() != OpCode.auth)) { p.requestHeader.setXid(cnxn.getXid()); } - p.createBB(); + if (p.bb == null ) p.createBB(); ByteBuffer pbb = p.bb; sock.write(pbb); if (!pbb.hasRemaining()) { sentCount++; + outgoingQueue.removeFirstOccurrence(p); if (p.requestHeader != null && p.requestHeader.getType() != OpCode.ping && p.requestHeader.getType() != OpCode.auth) { @@ -141,8 +141,12 @@ synchronized (pendingQueue) { pendingQueue.addAll(pending); } - } + if (outgoingQueue.isEmpty()) { + disableWrite(); + } else { + enableWrite(); + } } private Packet findSendablePacket(LinkedList<Packet> outgoingQueue, I still saw testLargeNodeData fail: Testcase: testLargeNodeData took 0.714 sec Caused an ERROR KeeperErrorCode = ConnectionLoss for /large org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /large at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:61)
          Hide
          Patrick Hunt added a comment -

          PDH: similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again

          Ted: I looked at svn log for ClientCnxnSocketNIO.java back to 2011-04-12 and didn't seem to find the above change. FYI

          Hi Ted, the following is what I was referring to. This is from latest on branch-3.3, 3.4.4 has a similar (although broken) block where it's a bit less obvious what's happening. branch-3.3 is more clear.

          Notice that we first attempt to write, if !remaining them we remove from the queue, otw we'll wait till the next time the selector wakes us up (the final isempty check is pretty criticial here as well to set interest correctly) and retry until the buffer is drained.

                      if (sockKey.isWritable()) {
                          synchronized (outgoingQueue) {
                              if (!outgoingQueue.isEmpty()) {
                                  ByteBuffer pbb = outgoingQueue.getFirst().bb;
                                  sock.write(pbb);
                                  if (!pbb.hasRemaining()) {
                                      sentCount++;
                                      Packet p = outgoingQueue.removeFirst();
                                      if (p.header != null
                                              && p.header.getType() != OpCode.ping
                                              && p.header.getType() != OpCode.auth) {
                                          pendingQueue.add(p);
                                      }
                                  }
                              }
                          }
                      }
                      if (outgoingQueue.isEmpty()) {
                          disableWrite();
                      } else {
                          enableWrite();
                      }
          
          Show
          Patrick Hunt added a comment - PDH: similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again Ted: I looked at svn log for ClientCnxnSocketNIO.java back to 2011-04-12 and didn't seem to find the above change. FYI Hi Ted, the following is what I was referring to. This is from latest on branch-3.3, 3.4.4 has a similar (although broken) block where it's a bit less obvious what's happening. branch-3.3 is more clear. Notice that we first attempt to write, if !remaining them we remove from the queue, otw we'll wait till the next time the selector wakes us up (the final isempty check is pretty criticial here as well to set interest correctly) and retry until the buffer is drained. if (sockKey.isWritable()) { synchronized (outgoingQueue) { if (!outgoingQueue.isEmpty()) { ByteBuffer pbb = outgoingQueue.getFirst().bb; sock.write(pbb); if (!pbb.hasRemaining()) { sentCount++; Packet p = outgoingQueue.removeFirst(); if (p.header != null && p.header.getType() != OpCode.ping && p.header.getType() != OpCode.auth) { pendingQueue.add(p); } } } } } if (outgoingQueue.isEmpty()) { disableWrite(); } else { enableWrite(); }
          Hide
          Ted Yu added a comment -

          Should we remove the Packet p and enqueue it to the tail of outgoingQueue if there is data pending ?

          The answer to my own question is no.
          This would very likely result in connection loss / timeout when node data size is large.

          Show
          Ted Yu added a comment - Should we remove the Packet p and enqueue it to the tail of outgoingQueue if there is data pending ? The answer to my own question is no. This would very likely result in connection loss / timeout when node data size is large.
          Hide
          Ted Yu added a comment -

          ClientTest normally takes 36 seconds.
          I ran ClientTest, twice, with Nikita's change:

              [junit] Running org.apache.zookeeper.test.ClientTest
              [junit] Tests run: 12, Failures: 12, Errors: 0, Time elapsed: 340.748 sec
              [junit] Test org.apache.zookeeper.test.ClientTest FAILED
          

          I then tried the following:

          Index: src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          ===================================================================
          --- src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (revision 1401900)
          +++ src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (working copy)
          @@ -111,17 +111,20 @@
                                   cnxn.sendThread.clientTunneledAuthenticationInProgress());
          
                           if (p != null) {
          -                    outgoingQueue.removeFirstOccurrence(p);
                               updateLastSend();
                               if ((p.requestHeader != null) &&
          -                            (p.requestHeader.getType() != OpCode.ping) &&
          -                            (p.requestHeader.getType() != OpCode.auth)) {
          -                        p.requestHeader.setXid(cnxn.getXid());
          +                        (p.requestHeader.getType() != OpCode.ping) &&
          +                        (p.requestHeader.getType() != OpCode.auth)) {
          +                      p.requestHeader.setXid(cnxn.getXid());
                               }
          -                    p.createBB();
          +                    if (p.bb == null) {
          +                        p.createBB();
          +                        // otherwise we're in the middle of sending packet
          +                    }
                               ByteBuffer pbb = p.bb;
                               sock.write(pbb);
                               if (!pbb.hasRemaining()) {
          +                        outgoingQueue.removeFirstOccurrence(p);
                                   sentCount++;
                                   if (p.requestHeader != null
                                           && p.requestHeader.getType() != OpCode.ping
          

          testLargeNodeData encountered connection loss.

          Show
          Ted Yu added a comment - ClientTest normally takes 36 seconds. I ran ClientTest, twice, with Nikita's change: [junit] Running org.apache.zookeeper.test.ClientTest [junit] Tests run: 12, Failures: 12, Errors: 0, Time elapsed: 340.748 sec [junit] Test org.apache.zookeeper.test.ClientTest FAILED I then tried the following: Index: src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java =================================================================== --- src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (revision 1401900) +++ src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java (working copy) @@ -111,17 +111,20 @@ cnxn.sendThread.clientTunneledAuthenticationInProgress()); if (p != null ) { - outgoingQueue.removeFirstOccurrence(p); updateLastSend(); if ((p.requestHeader != null ) && - (p.requestHeader.getType() != OpCode.ping) && - (p.requestHeader.getType() != OpCode.auth)) { - p.requestHeader.setXid(cnxn.getXid()); + (p.requestHeader.getType() != OpCode.ping) && + (p.requestHeader.getType() != OpCode.auth)) { + p.requestHeader.setXid(cnxn.getXid()); } - p.createBB(); + if (p.bb == null ) { + p.createBB(); + // otherwise we're in the middle of sending packet + } ByteBuffer pbb = p.bb; sock.write(pbb); if (!pbb.hasRemaining()) { + outgoingQueue.removeFirstOccurrence(p); sentCount++; if (p.requestHeader != null && p.requestHeader.getType() != OpCode.ping testLargeNodeData encountered connection loss.
          Hide
          Ted Yu added a comment -

          similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again

          I looked at svn log for ClientCnxnSocketNIO.java back to 2011-04-12 and didn't seem to find the above change.
          FYI

          Show
          Ted Yu added a comment - similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again I looked at svn log for ClientCnxnSocketNIO.java back to 2011-04-12 and didn't seem to find the above change. FYI
          Hide
          Ted Yu added a comment -

          Looking at createBB(), upon exit the field bb wouldn't be null.
          I wonder why p.createBB() is enclosed in the if (p.bb != null) block above ?

          Show
          Ted Yu added a comment - Looking at createBB(), upon exit the field bb wouldn't be null. I wonder why p.createBB() is enclosed in the if (p.bb != null) block above ?
          Hide
          Nikita Vetoshkin added a comment -

          If no one can prepend outgoingQueue with packet, straightforward implementation like this should work:

          diff --git a/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java b/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          index 70d8538..457c8cc 100644
          --- a/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          +++ b/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java
          @@ -111,17 +111,20 @@ public class ClientCnxnSocketNIO extends ClientCnxnSocket {
                                   cnxn.sendThread.clientTunneledAuthenticationInProgress());
           
                           if (p != null) {
          -                    outgoingQueue.removeFirstOccurrence(p);
                               updateLastSend();
          -                    if ((p.requestHeader != null) &&
          -                            (p.requestHeader.getType() != OpCode.ping) &&
          -                            (p.requestHeader.getType() != OpCode.auth)) {
          -                        p.requestHeader.setXid(cnxn.getXid());
          +                    if (p.bb != null) {
          +                        if ((p.requestHeader != null) &&
          +                                (p.requestHeader.getType() != OpCode.ping) &&
          +                                (p.requestHeader.getType() != OpCode.auth)) {
          +                            p.requestHeader.setXid(cnxn.getXid());
          +                        }
          +                        p.createBB();
          +                    // otherwise we're in the middle of sending packet
                               }
          -                    p.createBB();
                               ByteBuffer pbb = p.bb;
                               sock.write(pbb);
                               if (!pbb.hasRemaining()) {
          +                        outgoingQueue.removeFirstOccurrence(p);
                                   sentCount++;
                                   if (p.requestHeader != null
                                           && p.requestHeader.getType() != OpCode.ping
          
          
          Show
          Nikita Vetoshkin added a comment - If no one can prepend outgoingQueue with packet, straightforward implementation like this should work: diff --git a/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java b/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java index 70d8538..457c8cc 100644 --- a/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java +++ b/src/java/main/org/apache/zookeeper/ClientCnxnSocketNIO.java @@ -111,17 +111,20 @@ public class ClientCnxnSocketNIO extends ClientCnxnSocket { cnxn.sendThread.clientTunneledAuthenticationInProgress()); if (p != null) { - outgoingQueue.removeFirstOccurrence(p); updateLastSend(); - if ((p.requestHeader != null) && - (p.requestHeader.getType() != OpCode.ping) && - (p.requestHeader.getType() != OpCode.auth)) { - p.requestHeader.setXid(cnxn.getXid()); + if (p.bb != null) { + if ((p.requestHeader != null) && + (p.requestHeader.getType() != OpCode.ping) && + (p.requestHeader.getType() != OpCode.auth)) { + p.requestHeader.setXid(cnxn.getXid()); + } + p.createBB(); + // otherwise we're in the middle of sending packet } - p.createBB(); ByteBuffer pbb = p.bb; sock.write(pbb); if (!pbb.hasRemaining()) { + outgoingQueue.removeFirstOccurrence(p); sentCount++; if (p.requestHeader != null && p.requestHeader.getType() != OpCode.ping
          Hide
          Patrick Hunt added a comment -

          That's a good point, the while loop in the patch seems like it would block when the tcp buffer is full (e.g. if the server is slow to read). I don't think that's a good idea. Rather we should have the code structured similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again. Eventually the send buffer will drain and we can remove it from the queue.

          Show
          Patrick Hunt added a comment - That's a good point, the while loop in the patch seems like it would block when the tcp buffer is full (e.g. if the server is slow to read). I don't think that's a good idea. Rather we should have the code structured similar to what it was before - write as much as possible and then use the selector to wait for the socket to become writeable again. Eventually the send buffer will drain and we can remove it from the queue.
          Hide
          Ted Yu added a comment -

          For problem #3, I only found one call to getXid() in doIO:

                                  p.requestHeader.setXid(cnxn.getXid());
          

          which is not in a loop. Some clarification would be nice.

          Show
          Ted Yu added a comment - For problem #3, I only found one call to getXid() in doIO: p.requestHeader.setXid(cnxn.getXid()); which is not in a loop. Some clarification would be nice.
          Hide
          Skye Wanderman-Milne added a comment -

          The benefit of one socket write per invocation is that it frees up the SendThread to perform other actions in between writes. For instance, it looks like the SendThread is also response for reading incoming data in doIO. I'm not sure what the consequences of having the thread block until the whole packet is written are; it might be ok, but if we can avoid looping it might save us some trouble later.

          The next invocation of doIO should continue writing any remaining data from a previous packet (we don't want to mix up data from different packets). However, I'm not sure the current code will do this correctly – I think it's possible for another packet to be pushed onto the queue while another packet is in the middle of being written. I'm going to look into this further.

          Show
          Skye Wanderman-Milne added a comment - The benefit of one socket write per invocation is that it frees up the SendThread to perform other actions in between writes. For instance, it looks like the SendThread is also response for reading incoming data in doIO. I'm not sure what the consequences of having the thread block until the whole packet is written are; it might be ok, but if we can avoid looping it might save us some trouble later. The next invocation of doIO should continue writing any remaining data from a previous packet (we don't want to mix up data from different packets). However, I'm not sure the current code will do this correctly – I think it's possible for another packet to be pushed onto the queue while another packet is in the middle of being written. I'm going to look into this further.
          Hide
          Ted Yu added a comment -

          Thanks for the review.
          I agree chunking is not needed.

          it would be safer to take the current approach of only doing one socket write per invocation of doIO

          If we leave Packet p in the outgoingQueue if there is data pending to be written and return from doIO, would the next invocation of doIO continue writing the remaining data from the same Packet ?
          If so, what is the benefit of one socket write per invocation ?
          Should we remove the Packet p and enqueue it to the tail of outgoingQueue if there is data pending ?

          Show
          Ted Yu added a comment - Thanks for the review. I agree chunking is not needed. it would be safer to take the current approach of only doing one socket write per invocation of doIO If we leave Packet p in the outgoingQueue if there is data pending to be written and return from doIO, would the next invocation of doIO continue writing the remaining data from the same Packet ? If so, what is the benefit of one socket write per invocation ? Should we remove the Packet p and enqueue it to the tail of outgoingQueue if there is data pending ?
          Hide
          Skye Wanderman-Milne added a comment -

          A couple questions/concerns:

          • Is it safe to write out the whole packet in a while loop? Doing so essentially causes the thread to block until everything is written. This seems pretty dangerous to me since there's only a single SendThread – I think it would be safer to take the current approach of only doing one socket write per invocation of doIO. We would still need to address issue 1 in the description (packets are being removed from the queue even if they weren't fully written).
          • What is the purpose of chunking the data? Why not write as much as possible?
          • This was commented on earlier, but what's going on with issues 2 and 3 from the description? Addressing only issue 1 seems to fix the problem of sending large packets.
          Show
          Skye Wanderman-Milne added a comment - A couple questions/concerns: Is it safe to write out the whole packet in a while loop? Doing so essentially causes the thread to block until everything is written. This seems pretty dangerous to me since there's only a single SendThread – I think it would be safer to take the current approach of only doing one socket write per invocation of doIO. We would still need to address issue 1 in the description (packets are being removed from the queue even if they weren't fully written). What is the purpose of chunking the data? Why not write as much as possible? This was commented on earlier, but what's going on with issues 2 and 3 from the description? Addressing only issue 1 seems to fix the problem of sending large packets.
          Hide
          Ted Yu added a comment -

          Good news was that patch v7 passed.
          Not so good news was that I didn't find any occurrence of the warning message I added in v7.

          Essentially patch v7 is the same as patch v2 - we shouldn't bail if a single sock.write() call didn't make progress.

          Show
          Ted Yu added a comment - Good news was that patch v7 passed. Not so good news was that I didn't find any occurrence of the warning message I added in v7. Essentially patch v7 is the same as patch v2 - we shouldn't bail if a single sock.write() call didn't make progress.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548908/zookeeper-1560-v7.txt
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548908/zookeeper-1560-v7.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1218//console This message is automatically generated.
          Hide
          Ted Yu added a comment -

          Patch v7 changes the IOE to a warning.
          Let's see if the test is able to make further progress.

          I wonder whether 77152 bytes would be big enough for most use cases.

          Show
          Ted Yu added a comment - Patch v7 changes the IOE to a warning. Let's see if the test is able to make further progress. I wonder whether 77152 bytes would be big enough for most use cases.
          Hide
          Eugene Koontz added a comment -

          It seems like in a particular iteration, 0 bytes is written:

          localhost/127.0.0.1:11222, unexpected error, closing socket connection and attempting reconnect
               [exec]     [junit] java.io.IOException: Couldn't write 2000 bytes, 0 bytes written in this iteration and 77152 bytes written in total. Original limit: 500074
               [exec]     [junit] 	at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:145)
               [exec]     [junit] 	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:375)
               [exec]     [junit] 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
               [exec]     [junit] 2012-10-12 15:20:42,629 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@349] - caught end of stream exception
               [exec]     [junit] EndOfStreamException: Unable to read additional data from client sessionid 0x13a55902b650001, likely client has closed socket
               [exec]     [junit] 	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
               [exec]     [junit] 	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
               [exec]     [junit] 	at java.lang.Thread.run(Thread.java:662)
               [exec]     [junit] 2012-10-12 15:20:42,630 [myid:] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@1001] - Closed socket connection for client /127.0.0.1:57126 which had sessionid 0x13a55902b650001
          

          Seems like there's a strange resemblance among all the test failures thus far: always fails after 77152 bytes written.

          Show
          Eugene Koontz added a comment - It seems like in a particular iteration, 0 bytes is written: localhost/127.0.0.1:11222, unexpected error, closing socket connection and attempting reconnect [exec] [junit] java.io.IOException: Couldn't write 2000 bytes, 0 bytes written in this iteration and 77152 bytes written in total. Original limit: 500074 [exec] [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:145) [exec] [junit] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:375) [exec] [junit] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) [exec] [junit] 2012-10-12 15:20:42,629 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@349] - caught end of stream exception [exec] [junit] EndOfStreamException: Unable to read additional data from client sessionid 0x13a55902b650001, likely client has closed socket [exec] [junit] at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220) [exec] [junit] at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [exec] [junit] at java.lang. Thread .run( Thread .java:662) [exec] [junit] 2012-10-12 15:20:42,630 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11222:NIOServerCnxn@1001] - Closed socket connection for client /127.0.0.1:57126 which had sessionid 0x13a55902b650001 Seems like there's a strange resemblance among all the test failures thus far: always fails after 77152 bytes written.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548898/zookeeper-1560-v6.txt
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548898/zookeeper-1560-v6.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1217//console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548893/zookeeper-1560-v5.txt
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548893/zookeeper-1560-v5.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1216//console This message is automatically generated.
          Hide
          Ted Yu added a comment -

          Patch v6 changes the condition for raising IOE: if there is no progress between successive sock.write() calls.

          I guess socket's output buffer might be a limiting factor as to the number of bytes written in a particular sock.write() call.

          Show
          Ted Yu added a comment - Patch v6 changes the condition for raising IOE: if there is no progress between successive sock.write() calls. I guess socket's output buffer might be a limiting factor as to the number of bytes written in a particular sock.write() call.
          Hide
          Ted Yu added a comment -

          From https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/org.apache.zookeeper.test/ClientTest/testLargeNodeData/ :

          2012-10-12 14:10:50,042 [myid:] - WARN  [main-SendThread(localhost:11221):ClientCnxn$SendThread@1089] - Session 0x13a555031cf0000 for server localhost/127.0.0.1:11221, unexpected error, closing socket connection and attempting reconnect
          java.io.IOException: Couldn't write 2000 bytes, 1152 bytes written
          	at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:142)
          	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370)
          	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
          2012-10-12 14:10:50,044 [myid:] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@349] - caught end of stream exception
          EndOfStreamException: Unable to read additional data from client sessionid 0x13a555031cf0000, likely client has closed socket
          	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
          	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
          	at java.lang.Thread.run(Thread.java:662)
          

          Patch v5 adds more information to exception message.

          Show
          Ted Yu added a comment - From https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/org.apache.zookeeper.test/ClientTest/testLargeNodeData/ : 2012-10-12 14:10:50,042 [myid:] - WARN [main-SendThread(localhost:11221):ClientCnxn$SendThread@1089] - Session 0x13a555031cf0000 for server localhost/127.0.0.1:11221, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Couldn't write 2000 bytes, 1152 bytes written at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:142) at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068) 2012-10-12 14:10:50,044 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@349] - caught end of stream exception EndOfStreamException: Unable to read additional data from client sessionid 0x13a555031cf0000, likely client has closed socket at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220) at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) at java.lang. Thread .run( Thread .java:662) Patch v5 adds more information to exception message.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548889/zookeeper-1560-v4.txt
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548889/zookeeper-1560-v4.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1215//console This message is automatically generated.
          Hide
          Ted Yu added a comment -

          Patch v4 adopts what Igor and Jacky suggested.

          Since Packet p is guaranteed to send all bytes in its ByteBuffer, I think problem 2 no longer exists.

          ChrootClientTest and ClientTest both passed.

          Show
          Ted Yu added a comment - Patch v4 adopts what Igor and Jacky suggested. Since Packet p is guaranteed to send all bytes in its ByteBuffer, I think problem 2 no longer exists. ChrootClientTest and ClientTest both passed.
          Hide
          Jacky007 added a comment -

          I think this would work for both 1560 and 1561.

           if (p != null) {
                              updateLastSend();
                              if ((p.requestHeader != null) &&
                                      (p.requestHeader.getType() != OpCode.ping) &&
                                      (p.requestHeader.getType() != OpCode.auth)) {
                                  p.requestHeader.setXid(cnxn.getXid());
                              }
                              p.createBB();
                              ByteBuffer pbb = p.bb;
                    --->   while (pbb.hasRemaining()) sock.write(pbb);
                    --->   outgoingQueue.removeFirstOccurrence(p);
                              sentCount++;
                              if (p.requestHeader != null
                                      && p.requestHeader.getType() != OpCode.ping
                                      && p.requestHeader.getType() != OpCode.auth) {
                                  pending.add(p);
                              }
                 }
          
          Show
          Jacky007 added a comment - I think this would work for both 1560 and 1561. if (p != null) { updateLastSend(); if ((p.requestHeader != null) && (p.requestHeader.getType() != OpCode.ping) && (p.requestHeader.getType() != OpCode.auth)) { p.requestHeader.setXid(cnxn.getXid()); } p.createBB(); ByteBuffer pbb = p.bb; ---> while (pbb.hasRemaining()) sock.write(pbb); ---> outgoingQueue.removeFirstOccurrence(p); sentCount++; if (p.requestHeader != null && p.requestHeader.getType() != OpCode.ping && p.requestHeader.getType() != OpCode.auth) { pending.add(p); } }
          Hide
          Alexander Shraer added a comment -

          Hi Ted,

          instead of int expectedSize = pbb.limit() - pbb.position();
          you can use pbb.remaining().

          BTW, the JIRA description mentions several problems with the same block of code.
          The first one of them (dequeuing before we know sending succeeded) also causes ZK-1561.
          Please correct me if I'm wrong but seems like the current patch doesn't address the 3 problems
          mentioned in the description of this JIRA.

          Thanks,
          Alex

          Show
          Alexander Shraer added a comment - Hi Ted, instead of int expectedSize = pbb.limit() - pbb.position(); you can use pbb.remaining(). BTW, the JIRA description mentions several problems with the same block of code. The first one of them (dequeuing before we know sending succeeded) also causes ZK-1561. Please correct me if I'm wrong but seems like the current patch doesn't address the 3 problems mentioned in the description of this JIRA. Thanks, Alex
          Hide
          Ted Yu added a comment -

          I couldn't reproduce the error in test.ClientTest
          Looking at
          http://docs.oracle.com/javase/6/docs/api/java/nio/channels/SocketChannel.html#write%28java.nio.ByteBuffer%29

          Unless otherwise specified, a write operation will return only after writing all of the r requested bytes.

          I think the additional check in patch v3 may not be needed

          Show
          Ted Yu added a comment - I couldn't reproduce the error in test.ClientTest Looking at http://docs.oracle.com/javase/6/docs/api/java/nio/channels/SocketChannel.html#write%28java.nio.ByteBuffer%29 Unless otherwise specified, a write operation will return only after writing all of the r requested bytes. I think the additional check in patch v3 may not be needed
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548839/zookeeper-1560-v3.txt
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1214//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1214//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1214//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548839/zookeeper-1560-v3.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1214//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1214//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1214//console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548828/zookeeper-1560-v2.txt
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1213//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1213//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1213//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548828/zookeeper-1560-v2.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1213//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1213//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1213//console This message is automatically generated.
          Hide
          Ted Yu added a comment -

          Patch v3 adds the check for number of bytes written.

          ClientTest passed.

          Show
          Ted Yu added a comment - Patch v3 adds the check for number of bytes written. ClientTest passed.
          Hide
          Alexander Shraer added a comment -

          maybe IOException since this is what sock.write may throw anyway.

          Show
          Alexander Shraer added a comment - maybe IOException since this is what sock.write may throw anyway.
          Hide
          Ted Yu added a comment -

          What exception should be raised if not all data is written ?
          Thanks

          Show
          Ted Yu added a comment - What exception should be raised if not all data is written ? Thanks
          Hide
          Eugene Koontz added a comment -

          @Ted, your patch looks good as far as configurability/final-ness.

          Show
          Eugene Koontz added a comment - @Ted, your patch looks good as far as configurability/final-ness.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12548823/zookeeper-1560-v1.txt
          against trunk revision 1391526.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1212//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1212//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1212//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12548823/zookeeper-1560-v1.txt against trunk revision 1391526. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1212//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1212//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1212//console This message is automatically generated.
          Hide
          Alexander Shraer added a comment -

          Hi Ted,

          why not test for the return value of sock.write ?

          perhaps if it couldn't write the entire chunk size we can raise an exception.

          Show
          Alexander Shraer added a comment - Hi Ted, why not test for the return value of sock.write ? perhaps if it couldn't write the entire chunk size we can raise an exception.
          Hide
          Ted Yu added a comment -

          Thanks Eugene for the hint.

          Here is patch v2 which passed ClientTest as well.

          Show
          Ted Yu added a comment - Thanks Eugene for the hint. Here is patch v2 which passed ClientTest as well.
          Hide
          Eugene Koontz added a comment -

          Hi Ted,
          For making the chunk size configurable, I think something like this would work: (add to same file as you are patching (ClientCnxnSocketNIO.java))

          public static final String CLIENT_WRITE_CHUNK_SIZE = "zookeeper.client.nio.writechunk";
          
          ...
          
          final int CHUNK_SIZE = getProperty(CLIENT_WRITE_CHUNK_SIZE,2000);
          
          
          Show
          Eugene Koontz added a comment - Hi Ted, For making the chunk size configurable, I think something like this would work: (add to same file as you are patching (ClientCnxnSocketNIO.java)) public static final String CLIENT_WRITE_CHUNK_SIZE = "zookeeper.client.nio.writechunk" ; ... final int CHUNK_SIZE = getProperty(CLIENT_WRITE_CHUNK_SIZE,2000);
          Hide
          Ted Yu added a comment -

          Let QA run the tests

          Show
          Ted Yu added a comment - Let QA run the tests
          Hide
          Ted Yu added a comment -

          should CHUNK_SIZE be a configurable setting

          If I knew how to access Configuration instance from within ClientCnxnSocketNIO

          originalLimit can be final

          Sure. Will incorporate into next patch after I get more review comments.

          Show
          Ted Yu added a comment - should CHUNK_SIZE be a configurable setting If I knew how to access Configuration instance from within ClientCnxnSocketNIO originalLimit can be final Sure. Will incorporate into next patch after I get more review comments.
          Hide
          Eugene Koontz added a comment -

          Hi Ted, I think originalLimit can be final, just as CHUNK_SIZE is. Also should CHUNK_SIZE be a configurable setting, defaulting to 2000?

          Show
          Eugene Koontz added a comment - Hi Ted, I think originalLimit can be final, just as CHUNK_SIZE is. Also should CHUNK_SIZE be a configurable setting, defaulting to 2000?
          Hide
          Ted Yu added a comment -

          junit.run:
          [junit] Running org.apache.zookeeper.test.ClientTest
          [junit] Tests run: 12, Failures: 0, Errors: 0, Time elapsed: 17.694 sec

          Show
          Ted Yu added a comment - junit.run: [junit] Running org.apache.zookeeper.test.ClientTest [junit] Tests run: 12, Failures: 0, Errors: 0, Time elapsed: 17.694 sec
          Hide
          Ted Yu added a comment -

          Patch v1 implements the chunk'ed write.

          ClientTest passes.

          Please provide your comments.

          Show
          Ted Yu added a comment - Patch v1 implements the chunk'ed write. ClientTest passes. Please provide your comments.
          Hide
          Ted Yu added a comment -

          In doIO(), should we check the return value from:

                              sock.write(pbb);
          

          Here is jstack where testLargeNodeData hung:

          "main" prio=5 tid=7f9bed000800 nid=0x10c382000 in Object.wait() [10c380000]
             java.lang.Thread.State: WAITING (on object monitor)
          	at java.lang.Object.wait(Native Method)
          	- waiting on <7dc1a15c0> (a org.apache.zookeeper.ClientCnxn$Packet)
          	at java.lang.Object.wait(Object.java:485)
          	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
          	- locked <7dc1a15c0> (a org.apache.zookeeper.ClientCnxn$Packet)
          	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:781)
          	at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:531)
          

          I think we can send data in chunks if pbb.remaining() is beyond certain threshold.

          Show
          Ted Yu added a comment - In doIO(), should we check the return value from: sock.write(pbb); Here is jstack where testLargeNodeData hung: "main" prio=5 tid=7f9bed000800 nid=0x10c382000 in Object .wait() [10c380000] java.lang. Thread .State: WAITING (on object monitor) at java.lang. Object .wait(Native Method) - waiting on <7dc1a15c0> (a org.apache.zookeeper.ClientCnxn$Packet) at java.lang. Object .wait( Object .java:485) at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309) - locked <7dc1a15c0> (a org.apache.zookeeper.ClientCnxn$Packet) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:781) at org.apache.zookeeper.test.ClientTest.testLargeNodeData(ClientTest.java:531) I think we can send data in chunks if pbb.remaining() is beyond certain threshold.
          Hide
          Alexander Shraer added a comment -

          Please also see ZOOKEEPER-1561

          Show
          Alexander Shraer added a comment - Please also see ZOOKEEPER-1561
          Hide
          Igor Motov added a comment -

          Test that reproduces that issue.

          Show
          Igor Motov added a comment - Test that reproduces that issue.

            People

            • Assignee:
              Skye Wanderman-Milne
              Reporter:
              Igor Motov
            • Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development