ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-1197

Incorrect socket handling of 4 letter words for NIO

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Critical Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.3.3, 3.4.0
    • Fix Version/s: 3.5.0
    • Component/s: server
    • Labels:
      None

      Description

      When transferring a large amount of information from a 4 letter word, especially in interactive mode (telnet or nc) over a slower network link, the connection can be closed before all of the data has reached the client. This is due to the way we handle nc non-interactive mode, by cancelling the selector key.

      Instead of cancelling the selector key for 4-letter-words, we should instead flag the NIOServerCnxn to ignore detection of a close condition on that socket (CancelledKeyException, EndOfStreamException). Since the 4lw will close the connection immediately upon completion, this should be safe to do.

      See ZOOKEEPER-737 for more details

      1. ZOOKEEPER-1197.patch
        7 kB
        Camille Fournier
      2. ZOOKEEPER-1197_shutdown.patch
        1 kB
        Skye Wanderman-Milne
      3. long_4lw.patch
        2 kB
        Skye Wanderman-Milne

        Issue Links

          Activity

          Patrick Hunt made changes -
          Link This issue relates to ZOOKEEPER-805 [ ZOOKEEPER-805 ]
          Hide
          Patrick Hunt added a comment -

          afaik this is not a solvable problem short of implementing the client/server side properly - ala ZOOKEEPER-1346

          Should we close this? (as unfixable)

          Show
          Patrick Hunt added a comment - afaik this is not a solvable problem short of implementing the client/server side properly - ala ZOOKEEPER-1346 Should we close this? (as unfixable)
          Hide
          Skye Wanderman-Milne added a comment -

          Based on the second code snippet in the link Patrick found, I've written an alternate patch (ZOOKEEPER-1197_shutdown.patch) that waits for the client to shutdown the input connection before closing the socket.

          I've managed to reproduce the problem locally for testing:
          0. My base setup: trunk zk checkout, Ubuntu 12.04, Oracle Java 6, no fancy zoo.cfg
          1. I wrote a new 4lw, "long", that prints 100k '.'s followed by a '!' (so you can see if it's truncated). See long_4lw.patch.
          2. Slow down your localhost connection: sudo tc qdisc add dev lo root netem delay 1000ms
          (To undo this command: sudo tc qdisc del dev lo root)
          3. Start a zk server with ./bin/zkServer.sh start
          4. Send a long command: echo long | nc localhost 2181 (I've also been using FourLetterWordMain)
          This should fill your terminal with dots, but shouldn't end with a !. If you see the !, the output wasn't truncated (i.e., you haven't hit the bug).

          The new patch fixes the bug with this setup. Camille's patch also fixes it, although I do not fully understand how Since Camille reports having problems even after applying her patch (which I could not reproduce), this patch probably needs further testing as well. It would also be useful to test with a variety of clients (it sounds like this is the standard way to close a TCP connection, but I'm not positive every client will shut down the connection correctly).

          Show
          Skye Wanderman-Milne added a comment - Based on the second code snippet in the link Patrick found, I've written an alternate patch ( ZOOKEEPER-1197 _shutdown.patch) that waits for the client to shutdown the input connection before closing the socket. I've managed to reproduce the problem locally for testing: 0. My base setup: trunk zk checkout, Ubuntu 12.04, Oracle Java 6, no fancy zoo.cfg 1. I wrote a new 4lw, "long", that prints 100k '.'s followed by a '!' (so you can see if it's truncated). See long_4lw.patch. 2. Slow down your localhost connection: sudo tc qdisc add dev lo root netem delay 1000ms (To undo this command: sudo tc qdisc del dev lo root) 3. Start a zk server with ./bin/zkServer.sh start 4. Send a long command: echo long | nc localhost 2181 (I've also been using FourLetterWordMain) This should fill your terminal with dots, but shouldn't end with a !. If you see the !, the output wasn't truncated (i.e., you haven't hit the bug). The new patch fixes the bug with this setup. Camille's patch also fixes it, although I do not fully understand how Since Camille reports having problems even after applying her patch (which I could not reproduce), this patch probably needs further testing as well. It would also be useful to test with a variety of clients (it sounds like this is the standard way to close a TCP connection, but I'm not positive every client will shut down the connection correctly).
          Skye Wanderman-Milne made changes -
          Attachment ZOOKEEPER-1197_shutdown.patch [ 12547295 ]
          Attachment long_4lw.patch [ 12547296 ]
          Show
          Patrick Hunt added a comment - Interesting web page I found: http://ia600609.us.archive.org/22/items/TheUltimateSo_lingerPageOrWhyIsMyTcpNotReliable/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable.html
          Camille Fournier made changes -
          Link This issue relates to ZOOKEEPER-1346 [ ZOOKEEPER-1346 ]
          Mahadev konar made changes -
          Fix Version/s 3.4.1 [ 12318650 ]
          Fix Version/s 3.3.5 [ 12319081 ]
          Patrick Hunt made changes -
          Fix Version/s 3.3.5 [ 12319081 ]
          Fix Version/s 3.3.4 [ 12316276 ]
          Camille Fournier made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Patrick Hunt added a comment -

          Can you take a look at ZOOKEEPER-1237 ? I believe we're seeing the same issue for regular sessions. This shows up much more frequently once linger is turned off (at least in my testing of ZOOKEEPER-1049 that seems to be the case).

          Seems we need to do a better job coordinating session/4lw teardown across the board?

          Show
          Patrick Hunt added a comment - Can you take a look at ZOOKEEPER-1237 ? I believe we're seeing the same issue for regular sessions. This shows up much more frequently once linger is turned off (at least in my testing of ZOOKEEPER-1049 that seems to be the case). Seems we need to do a better job coordinating session/4lw teardown across the board?
          Mahadev konar made changes -
          Fix Version/s 3.4.1 [ 12318650 ]
          Fix Version/s 3.4.0 [ 12314469 ]
          Camille Fournier made changes -
          Priority Blocker [ 1 ] Critical [ 2 ]
          Hide
          Camille Fournier added a comment -

          I'm not entirely sure why closing the connection from the client isn't feasible. Right now, if we were to leave the client connection as it is, it would still close itself in nc non-interactive (and possibly truncate the data), and for other clients it would close when the client closed, or we could put a timeout around connections created with a 4lw since we can identify them. I think the trickier thing is that anything else that comes in off a 4lw connection causes an IO length exception which is nasty.

          I'll play with SO_LINGER again. It doesn't fix this problem in my testing, but my testing was under windows. If it makes it better in linux, it's still worth doing for that bit.

          Show
          Camille Fournier added a comment - I'm not entirely sure why closing the connection from the client isn't feasible. Right now, if we were to leave the client connection as it is, it would still close itself in nc non-interactive (and possibly truncate the data), and for other clients it would close when the client closed, or we could put a timeout around connections created with a 4lw since we can identify them. I think the trickier thing is that anything else that comes in off a 4lw connection causes an IO length exception which is nasty. I'll play with SO_LINGER again. It doesn't fix this problem in my testing, but my testing was under windows. If it makes it better in linux, it's still worth doing for that bit.
          Hide
          Mahadev konar added a comment -

          Camille,
          What do we want to do then? Closing the connection from client is probably not feasible. Should we just checkin what we have? I am not a big fan of letting the connections linger on the server and then close them later.

          Show
          Mahadev konar added a comment - Camille, What do we want to do then? Closing the connection from client is probably not feasible. Should we just checkin what we have? I am not a big fan of letting the connections linger on the server and then close them later.
          Hide
          Camille Fournier added a comment -

          I'm saying, in my testing, setting this value has never had any effect. It seem that the OS is ignoring this setting.

          Show
          Camille Fournier added a comment - I'm saying, in my testing, setting this value has never had any effect. It seem that the OS is ignoring this setting.
          Hide
          Flavio Junqueira added a comment -

          Could you elaborate please? My understanding is that the linger option is there exactly to allow unsent data to be transmitted, so I'm not sure why you say that it has no effect.

          Show
          Flavio Junqueira added a comment - Could you elaborate please? My understanding is that the linger option is there exactly to allow unsent data to be transmitted, so I'm not sure why you say that it has no effect.
          Hide
          Camille Fournier added a comment -

          socket linger has no effect on this bug.

          Show
          Camille Fournier added a comment - socket linger has no effect on this bug.
          Hide
          Flavio Junqueira added a comment -

          Camille, is setting the socket to linger an option here?

          Show
          Flavio Junqueira added a comment - Camille, is setting the socket to linger an option here?
          Hide
          Camille Fournier added a comment -

          One more update for this. After talking to some folks here that are more networking experts than I am, it appears that the only way to truly fix this bug is to allow the client to initiate the close of the socket, not the server. Because a socket close event (especially in recent Linux kernels) may cause the kernel to give up on sending data even when the server has not finished/received acks, the only way to guarantee that a client has actually received all the data is to wait for the client to close the connection. We can add a timeout in the server to clean up 4lw clients that have not closed their connection, but we cannot rely on all of the data getting to the client without an explicit client close.

          Show
          Camille Fournier added a comment - One more update for this. After talking to some folks here that are more networking experts than I am, it appears that the only way to truly fix this bug is to allow the client to initiate the close of the socket, not the server. Because a socket close event (especially in recent Linux kernels) may cause the kernel to give up on sending data even when the server has not finished/received acks, the only way to guarantee that a client has actually received all the data is to wait for the client to close the connection. We can add a timeout in the server to clean up 4lw clients that have not closed their connection, but we cannot rely on all of the data getting to the client without an explicit client close.
          Hide
          Camille Fournier added a comment -

          After some manual testing I've verified that this greatly improves the problem, but does not completely solve it, so I'm still stuck on this. Any help would be appreciated, this is a major annoyance when monitoring over any sort of WAN

          Show
          Camille Fournier added a comment - After some manual testing I've verified that this greatly improves the problem, but does not completely solve it, so I'm still stuck on this. Any help would be appreciated, this is a major annoyance when monitoring over any sort of WAN
          Patrick Hunt made changes -
          Fix Version/s 3.5.0 [ 12316644 ]
          Hide
          Thomas Koch added a comment -

          Could you please be so kind to add the Code for review to https://reviews.apache.org/r/new/ ? Thank you!

          Show
          Thomas Koch added a comment - Could you please be so kind to add the Code for review to https://reviews.apache.org/r/new/ ? Thank you!
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12495373/ZOOKEEPER-1197.patch
          against trunk revision 1172406.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12495373/ZOOKEEPER-1197.patch against trunk revision 1172406. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/574//console This message is automatically generated.
          Hide
          Camille Fournier added a comment -

          I'm not sure how to write a test for this. On a local connection, especially within the same VM, I really doubt that I can reproduce the issue (no luck so far). I have tested this manually.

          Show
          Camille Fournier added a comment - I'm not sure how to write a test for this. On a local connection, especially within the same VM, I really doubt that I can reproduce the issue (no luck so far). I have tested this manually.
          Camille Fournier made changes -
          Attachment ZOOKEEPER-1197.patch [ 12495373 ]
          Camille Fournier made changes -
          Field Original Value New Value
          Status Open [ 1 ] Patch Available [ 10002 ]
          Camille Fournier created issue -

            People

            • Assignee:
              Camille Fournier
              Reporter:
              Camille Fournier
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:

                Development