ZooKeeper
  1. ZooKeeper
  2. ZOOKEEPER-737

some 4 letter words may fail with netcat (nc)

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.1, 3.4.0
    • Component/s: server
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      nc closes the write channel as soon as it's sent it's information, for example "echo stat|nc localhost 2181"
      in general this is fine, however the server code will close the socket as soon as it receives notice that nc has
      closed it's write channel. if not all the 4 letter word result has been written back to the client yet, this will cause
      some or all of the result to be lost - ie the client will not see the full result. this was introduced in 3.3.0 as part
      of a change to reduce blocking of the selector by long running 4letter words.

      here's an example of the logs from the server during this

      echo -n stat | nc localhost 2181
      2010-04-09 21:55:36,124 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@251] - Accepted socket connection from /127.0.0.1:42179
      2010-04-09 21:55:36,124 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@968] - Processing stat command from /127.0.0.1:42179
      2010-04-09 21:55:36,125 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@606] - EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
      2010-04-09 21:55:36,125 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1286] - Closed socket connection for client /127.0.0.1:42179 (no session established for client)
      [phunt@gsbl90850 zookeeper-3.3.0]$ 2010-04-09 21:55:36,126 - ERROR [Thread-15:NIOServerCnxn@422] - Unexpected Exception:
      java.nio.channels.CancelledKeyException
      at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
      at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
      at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:395)
      at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.checkFlush(NIOServerCnxn.java:907)
      at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.flush(NIOServerCnxn.java:945)
      at java.io.BufferedWriter.flush(BufferedWriter.java:236)
      at java.io.PrintWriter.flush(PrintWriter.java:276)
      at org.apache.zookeeper.server.NIOServerCnxn$2.run(NIOServerCnxn.java:1089)
      2010-04-09 21:55:36,126 - ERROR [Thread-15:NIOServerCnxn$Factory$1@82] - Thread Thread[Thread-15,5,main] died
      java.nio.channels.CancelledKeyException
      at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
      at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:64)
      at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.wakeup(NIOServerCnxn.java:927)
      at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.checkFlush(NIOServerCnxn.java:909)
      at org.apache.zookeeper.server.NIOServerCnxn$SendBufferWriter.flush(NIOServerCnxn.java:945)
      at java.io.BufferedWriter.flush(BufferedWriter.java:236)
      at java.io.PrintWriter.flush(PrintWriter.java:276)
      at org.apache.zookeeper.server.NIOServerCnxn$2.run(NIOServerCnxn.java:1089)

      1. ZOOKEEPER-737.patch
        24 kB
        Mahadev konar
      2. ZOOKEEPER-737.patch
        8 kB
        Mahadev konar
      3. ZOOKEEPER-737.patch
        8 kB
        Mahadev konar
      4. ZOOKEEPER-737.patch
        8 kB
        Mahadev konar
      5. ZOOKEEPER-737.patch
        9 kB
        Patrick Hunt
      6. ZOOKEEPER-737.patch
        8 kB
        Mahadev konar
      7. ZOOKEEPER-737.patch
        7 kB
        Mahadev konar

        Issue Links

          Activity

          Hide
          Camille Fournier added a comment -

          I think that makes sense. We should be very clear about the fact that we can't support nc non-interactive with netty unless netty actually starts to support it.

          Show
          Camille Fournier added a comment - I think that makes sense. We should be very clear about the fact that we can't support nc non-interactive with netty unless netty actually starts to support it.
          Hide
          Patrick Hunt added a comment -

          4) use netty to implement this (no nio)

          Show
          Patrick Hunt added a comment - 4) use netty to implement this (no nio)
          Hide
          Patrick Hunt added a comment -

          How's this?

          One option that would make this a lot less ugly would be to deprecate support for 4letterwords on the client port in 3.5.0 (dep, not remove, although provide an option to turn off via config), we'd designate a new port specifically for 4letterwords. We'd fix this problem on the new port, the old port would have the existing limitations.

          3.6.0 we would remove 4lw on client port entirely.

          This is good for a number of reasons IMO – one is that having 4lw on the client port is a bit of a security issue in some customers - as any client has access to this port and it cannot by definition be firewalled. Many admins would like to firewall this.

          In the process we could:
          1) make the new port fully b/w compatible with the existing functionality
          2) enable long lived sessions rather than polling
          3) provide support for "extended command syntax" which would enhance 4lw features (for example to control the format of the output)
          4) etc...

          I realize this doesn't fix the existing, but it does provide for a reasonable way forward – we'll pay off some debt.

          Show
          Patrick Hunt added a comment - How's this? One option that would make this a lot less ugly would be to deprecate support for 4letterwords on the client port in 3.5.0 (dep, not remove, although provide an option to turn off via config), we'd designate a new port specifically for 4letterwords. We'd fix this problem on the new port, the old port would have the existing limitations. 3.6.0 we would remove 4lw on client port entirely. This is good for a number of reasons IMO – one is that having 4lw on the client port is a bit of a security issue in some customers - as any client has access to this port and it cannot by definition be firewalled. Many admins would like to firewall this. In the process we could: 1) make the new port fully b/w compatible with the existing functionality 2) enable long lived sessions rather than polling 3) provide support for "extended command syntax" which would enhance 4lw features (for example to control the format of the output) 4) etc... I realize this doesn't fix the existing, but it does provide for a reasonable way forward – we'll pay off some debt.
          Hide
          Camille Fournier added a comment -

          I've taken a lot of wrong steps on this one, but I think I have a solution.

          Here's the problem as I see it: We can't cancel that selector key, because if we do that, the socket close will cut off the connection to the client before all the data has flushed through the network. I have validated that the data is all flushed to the byte buffer, but it never gets to the client if there is network slowness and it can't all successfully arrive before the CommandThread gets to the call to close. This can happen even in nc noninteractive mode (as we have observed with several false weekend alerts).

          However, if we don't cancel the selector key, we'll potentially see a cancelled key exception or an EOF exception in our selector loops in non-interactive netcat, and that will preemptively close the socket.

          So we need to keep the key from being cancelled by our processes to ensure the data gets completely sent, but we also need to ignore errors from selecting this key in the case of a 4lw.

          Right now, my solution is a total ugly hack that looks something like:

          create a boolean in NIOServerCnxn called "ignoreClose", initally set to false. In checkFourLetterWord, set this boolean to true. In the Factory run loop, if we get a CancelledKeyException, with a NIOServerCnxn attachment, and the boolean set to true, ignore the exception. In doIO, if we get an EOFException (possibly any exception) and this boolean is set to true, ignore the exception. This lets us ignore the effects of a cancelled/closed incoming connection from nc without losing data on the socket for times when a 4lw needs a lot of data to be sent or a distance to send it.

          Thoughts on this? It has been a bit of a nightmare to figure out, and my googling seems to indicate that netty won't support nc at all ( https://issues.jboss.org/browse/NETTY-236 ).

          Show
          Camille Fournier added a comment - I've taken a lot of wrong steps on this one, but I think I have a solution. Here's the problem as I see it: We can't cancel that selector key, because if we do that, the socket close will cut off the connection to the client before all the data has flushed through the network. I have validated that the data is all flushed to the byte buffer, but it never gets to the client if there is network slowness and it can't all successfully arrive before the CommandThread gets to the call to close. This can happen even in nc noninteractive mode (as we have observed with several false weekend alerts). However, if we don't cancel the selector key, we'll potentially see a cancelled key exception or an EOF exception in our selector loops in non-interactive netcat, and that will preemptively close the socket. So we need to keep the key from being cancelled by our processes to ensure the data gets completely sent, but we also need to ignore errors from selecting this key in the case of a 4lw. Right now, my solution is a total ugly hack that looks something like: create a boolean in NIOServerCnxn called "ignoreClose", initally set to false. In checkFourLetterWord, set this boolean to true. In the Factory run loop, if we get a CancelledKeyException, with a NIOServerCnxn attachment, and the boolean set to true, ignore the exception. In doIO, if we get an EOFException (possibly any exception) and this boolean is set to true, ignore the exception. This lets us ignore the effects of a cancelled/closed incoming connection from nc without losing data on the socket for times when a 4lw needs a lot of data to be sent or a distance to send it. Thoughts on this? It has been a bit of a nightmare to figure out, and my googling seems to indicate that netty won't support nc at all ( https://issues.jboss.org/browse/NETTY-236 ).
          Hide
          Camille Fournier added a comment -

          Let me see what I can do.

          Show
          Camille Fournier added a comment - Let me see what I can do.
          Hide
          Patrick Hunt added a comment -

          Sounds like a good fix for 3.3.4. Is there any automated testing (can you add to existing), or just manual?

          Show
          Patrick Hunt added a comment - Sounds like a good fix for 3.3.4. Is there any automated testing (can you add to existing), or just manual?
          Hide
          Camille Fournier added a comment -

          I was doing a mixture of testing against 3.4 (for netty) and 3.3 (for everything else). The soLinger change didn't make it into 3.3.3. Should I put it into there? I think that is the right solution.

          Show
          Camille Fournier added a comment - I was doing a mixture of testing against 3.4 (for netty) and 3.3 (for everything else). The soLinger change didn't make it into 3.3.3. Should I put it into there? I think that is the right solution.
          Hide
          Benjamin Reed added a comment -

          that is weird, since we should have already done setSoLinger(false, -1) when the socket was accepted. (see the constructor for NIOServerCnxn.) perhaps the slight delay of invoking the setsolinger gives the packets enough time to get out...

          Show
          Benjamin Reed added a comment - that is weird, since we should have already done setSoLinger(false, -1) when the socket was accepted. (see the constructor for NIOServerCnxn.) perhaps the slight delay of invoking the setsolinger gives the packets enough time to get out...
          Hide
          Camille Fournier added a comment -

          So here's an interesting observation: If I do nothing but set the socket SoLinger to
          sock.socket().setSoLinger(false, -1);

          everything actually seems to work. Except nc interactive, which still fails, but I don't care about that. I'm going to suggest we add the fix for ZOOKEEPER-1049 into 3.3.4. We still need to fix netty.

          Show
          Camille Fournier added a comment - So here's an interesting observation: If I do nothing but set the socket SoLinger to sock.socket().setSoLinger(false, -1); everything actually seems to work. Except nc interactive, which still fails, but I don't care about that. I'm going to suggest we add the fix for ZOOKEEPER-1049 into 3.3.4. We still need to fix netty.
          Hide
          Camille Fournier added a comment -

          I have no idea, this isn't on ubuntu though, I suspect it's traditional.
          Should I make a ticket for this fix? I don't feel like I fully understand the problem myself but if you guys think that just extending the SO_LINGER for the 4lws is the right way to go for the NIO version, I'm happy to make a patch and test it out.

          Show
          Camille Fournier added a comment - I have no idea, this isn't on ubuntu though, I suspect it's traditional. Should I make a ticket for this fix? I don't feel like I fully understand the problem myself but if you guys think that just extending the SO_LINGER for the 4lws is the right way to go for the NIO version, I'm happy to make a patch and test it out.
          Hide
          Patrick Hunt added a comment -

          which version of nc are you using? The newer (ubuntu pkging) openbsd or "traditional"? You might try both.

          We got it to work well with traditional, then ubuntu went and made openbsd the default:

          phunt@ubuntu:~$ nc
          This is nc from the netcat-openbsd package. An alternative nc is available
          in the netcat-traditional package.
          usage: nc [-46DdhklnrStUuvzC] [-i interval] [-P proxy_username] [-p source_port]
          	  [-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_protocol]
          	  [-x proxy_address[:port]] [hostname] [port[s]]
          phunt@ubuntu:~$ ls /bin/nc.*
          /bin/nc.openbsd*  /bin/nc.traditional*
          
          Show
          Patrick Hunt added a comment - which version of nc are you using? The newer (ubuntu pkging) openbsd or "traditional"? You might try both. We got it to work well with traditional, then ubuntu went and made openbsd the default: phunt@ubuntu:~$ nc This is nc from the netcat-openbsd package. An alternative nc is available in the netcat-traditional package. usage: nc [-46DdhklnrStUuvzC] [-i interval] [-P proxy_username] [-p source_port] [-s source_ip_address] [-T ToS] [-w timeout] [-X proxy_protocol] [-x proxy_address[:port]] [hostname] [port[s]] phunt@ubuntu:~$ ls /bin/nc.* /bin/nc.openbsd* /bin/nc.traditional*
          Hide
          Camille Fournier added a comment -

          Netty is apparently even worse, as it hangs the connection and I can't even ctrl-c out of it in telnet.

          Show
          Camille Fournier added a comment - Netty is apparently even worse, as it hangs the connection and I can't even ctrl-c out of it in telnet.
          Hide
          Benjamin Reed added a comment -

          does everything work ok with netty?

          Show
          Benjamin Reed added a comment - does everything work ok with netty?
          Hide
          Camille Fournier added a comment -

          That seems to maybe fix telnet, but not nc interactive. Does that matter?
          Feels like yet another reason to get to Netty.

          Show
          Camille Fournier added a comment - That seems to maybe fix telnet, but not nc interactive. Does that matter? Feels like yet another reason to get to Netty.
          Hide
          Benjamin Reed added a comment -

          i'm wondering if it is the linger... would you be able to patch a server with:

          diff --git src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java
          index 1e8c68e..5f943e9 100644
          — src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java
          +++ src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java
          @@ -494,6 +494,9 @@ public class NIOServerCnxn extends ServerCnxn {
          public void run() {
          try {
          commandRun();
          + if (sock != null)

          { + sock.socket().setSoLinger(true, 15); + }

          } catch (IOException ie)

          { LOG.error("Error in running command ", ie); }

          finally {

          and see if it still happens?

          Show
          Benjamin Reed added a comment - i'm wondering if it is the linger... would you be able to patch a server with: diff --git src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java index 1e8c68e..5f943e9 100644 — src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java +++ src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java @@ -494,6 +494,9 @@ public class NIOServerCnxn extends ServerCnxn { public void run() { try { commandRun(); + if (sock != null) { + sock.socket().setSoLinger(true, 15); + } } catch (IOException ie) { LOG.error("Error in running command ", ie); } finally { and see if it still happens?
          Hide
          Camille Fournier added a comment -

          One more thing to add, when I run this with "interactive" netcat, I see the same problem:
          nc foo 2181
          dump
          SessionTracker dump:
          Session Sets (8):
          0 expire at Mon Sep 12 17:31:12 EDT 2011:
          0 expire at Mon Sep 12 17:31:14 EDT 2011:
          0 expire at Mon Sep 12 17:31:16 EDT 2011:
          ....
          0x23259a57465000a
          0x33259a5746d0009
          0x33259a5746-bash-3.2$

          Does work ok if I do echo -n dump | nc foo 2181

          Show
          Camille Fournier added a comment - One more thing to add, when I run this with "interactive" netcat, I see the same problem: nc foo 2181 dump SessionTracker dump: Session Sets (8): 0 expire at Mon Sep 12 17:31:12 EDT 2011: 0 expire at Mon Sep 12 17:31:14 EDT 2011: 0 expire at Mon Sep 12 17:31:16 EDT 2011: .... 0x23259a57465000a 0x33259a5746d0009 0x33259a5746-bash-3.2$ Does work ok if I do echo -n dump | nc foo 2181
          Hide
          Camille Fournier added a comment -

          Yeah, I don't know anything about the difference between nc or telnet, or what zkdashboard is using, but this is with telnet interactive (reproducing a problem we see in zkdashboard). It's reproducible but tricky. If I run stat/dump from a remote server into a leader with a lot of connections/ephemerals, it reliably fails. It prints out some of the data and closes the connection suddenly. For example, the end of a dump command:
          14 expire at Mon Sep 12 14:14:04 EDT 2011:
          0x1325208b8c90089
          0x2325208b8c30099
          0x4325208b8ca0113
          0x1325208b8c900cd
          0x2325208b8c3009d
          0x5325211469900ae
          0x2325208b8c30091
          0x4325208b8ca00b7
          0x1325208b8c90094
          0x1325208b8c90090
          0x5325211469900d7
          0x5325211469900a4
          0x2325208b8c3009e
          0x2325208b8c3009c
          0 expire at Mon Sep 12 14:14:10 EDT 2011:Connection closed by foreign host.

          I tried a bit of debugging against the running server. Breakpoints anywhere inside the dump thread before the exit of closeSock() will cause the problem not to occur. But a breakpoint at the exit of closeSock() will still show the problem.

          This is "a lot" of ephemerals/sessions in that we're talking about ~120 sessions and 88 ephemerals. Hardly thousands.

          Show
          Camille Fournier added a comment - Yeah, I don't know anything about the difference between nc or telnet, or what zkdashboard is using, but this is with telnet interactive (reproducing a problem we see in zkdashboard). It's reproducible but tricky. If I run stat/dump from a remote server into a leader with a lot of connections/ephemerals, it reliably fails. It prints out some of the data and closes the connection suddenly. For example, the end of a dump command: 14 expire at Mon Sep 12 14:14:04 EDT 2011: 0x1325208b8c90089 0x2325208b8c30099 0x4325208b8ca0113 0x1325208b8c900cd 0x2325208b8c3009d 0x5325211469900ae 0x2325208b8c30091 0x4325208b8ca00b7 0x1325208b8c90094 0x1325208b8c90090 0x5325211469900d7 0x5325211469900a4 0x2325208b8c3009e 0x2325208b8c3009c 0 expire at Mon Sep 12 14:14:10 EDT 2011:Connection closed by foreign host. I tried a bit of debugging against the running server. Breakpoints anywhere inside the dump thread before the exit of closeSock() will cause the problem not to occur. But a breakpoint at the exit of closeSock() will still show the problem. This is "a lot" of ephemerals/sessions in that we're talking about ~120 sessions and 88 ephemerals. Hardly thousands.
          Hide
          Benjamin Reed added a comment -

          telnet works a bit different than nc. are you using it in an interactive way? (in other words, are you typing the commands?)

          is the problem that the results are getting truncated? i assume the problem is reproducible correct?

          Show
          Benjamin Reed added a comment - telnet works a bit different than nc. are you using it in an interactive way? (in other words, are you typing the commands?) is the problem that the results are getting truncated? i assume the problem is reproducible correct?
          Hide
          Camille Fournier added a comment -

          Hey guys,

          I'm still seeing this problem when using telnet to get 4 letter word results. This is on servers running 3.3.3. I'm at a loss for how to debug this problem, does anyone have any suggestion? It's causing the zk dashboard to fail when there are more than 100 or so connections on a far DC server.

          Show
          Camille Fournier added a comment - Hey guys, I'm still seeing this problem when using telnet to get 4 letter word results. This is on servers running 3.3.3. I'm at a loss for how to debug this problem, does anyone have any suggestion? It's causing the zk dashboard to fail when there are more than 100 or so connections on a far DC server.
          Hide
          Mahadev konar added a comment -

          I just committed this to trunk and 3.3 branch.

          Show
          Mahadev konar added a comment - I just committed this to trunk and 3.3 branch.
          Hide
          Benjamin Reed added a comment -

          +1 great job mahadev

          Show
          Benjamin Reed added a comment - +1 great job mahadev
          Hide
          Mahadev konar added a comment -

          ben, can you take a look at it? We should commit this ASAP, since 3.3.1 is waiting on it.

          Show
          Mahadev konar added a comment - ben, can you take a look at it? We should commit this ASAP, since 3.3.1 is waiting on it.
          Hide
          Patrick Hunt added a comment -

          +1 latest patch looks good to me. Did a bunch of system testing and it was working fine (used netstat to verify).

          Show
          Patrick Hunt added a comment - +1 latest patch looks good to me. Did a bunch of system testing and it was working fine (used netstat to verify).
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443310/ZOOKEEPER-737.patch
          against trunk revision 939172.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/80/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/80/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/80/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443310/ZOOKEEPER-737.patch against trunk revision 939172. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/80/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/80/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/80/console This message is automatically generated.
          Hide
          Mahadev konar added a comment -

          this patch makes all commands to run as threads. Also, this includes refactoring of the command code into seperate classes. I am really looking forward to Netty, this is just getting too hairy in NIOServerCnxn!

          Show
          Mahadev konar added a comment - this patch makes all commands to run as threads. Also, this includes refactoring of the command code into seperate classes. I am really looking forward to Netty, this is just getting too hairy in NIOServerCnxn!
          Hide
          Mahadev konar added a comment -

          very good point ben......

          Show
          Mahadev konar added a comment - very good point ben......
          Hide
          Benjamin Reed added a comment -

          -1 we have a problem for the cases when a thread isn't spawned to handle the command. in those cases the send could block which would block the NIO thread.

          Show
          Benjamin Reed added a comment - -1 we have a problem for the cases when a thread isn't spawned to handle the command. in those cases the send could block which would block the NIO thread.
          Hide
          Mahadev konar added a comment -

          ben, would you be able to review this patch?

          Show
          Mahadev konar added a comment - ben, would you be able to review this patch?
          Hide
          Patrick Hunt added a comment -

          Looks good to me, +1. Tests pass on my system. Would be a good idea to get a review from another commiter though.

          Show
          Patrick Hunt added a comment - Looks good to me, +1. Tests pass on my system. Would be a good idea to get a review from another commiter though.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12443006/ZOOKEEPER-737.patch
          against trunk revision 938212.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/75/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/75/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/75/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443006/ZOOKEEPER-737.patch against trunk revision 938212. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/75/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/75/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/75/console This message is automatically generated.
          Hide
          Mahadev konar added a comment -

          Updated the patch to use blocking IO for commands so that we dont have any spinning issues with the socket channels.

          Show
          Mahadev konar added a comment - Updated the patch to use blocking IO for commands so that we dont have any spinning issues with the socket channels.
          Hide
          Mahadev konar added a comment -

          this issue could be possible only if the client's tcp buffer is filled up and the write buffer is also filled up. The defaults for WAN specifications are:

          tcp read/write sysctl config min default max
          net.ipv4.tcp_rmem 4096 87380 174760
          net.ipv4.tcp_wmem 4096 16384 131072

          In a non WAN tcp connection the read buffer should be atleast 87KB and write buffer should be 16KB, which gives us a total of 100KB, which should be good enough (except for huge ephemeral dumps on a large tree). Even in that case according to my calculation we should be able to support 50, 000 ephemeral nodes, in which case the client will not be reading the data but the server is still writitng it out!

          Also, much more importantly all commands will be called via nc/telnet which should not be malformed (not reading off the socket).

          Show
          Mahadev konar added a comment - this issue could be possible only if the client's tcp buffer is filled up and the write buffer is also filled up. The defaults for WAN specifications are: tcp read/write sysctl config min default max net.ipv4.tcp_rmem 4096 87380 174760 net.ipv4.tcp_wmem 4096 16384 131072 In a non WAN tcp connection the read buffer should be atleast 87KB and write buffer should be 16KB, which gives us a total of 100KB, which should be good enough (except for huge ephemeral dumps on a large tree). Even in that case according to my calculation we should be able to support 50, 000 ephemeral nodes, in which case the client will not be reading the data but the server is still writitng it out! Also, much more importantly all commands will be called via nc/telnet which should not be malformed (not reading off the socket).
          Hide
          Patrick Hunt added a comment -

          Seems like this could infinite loop in some bad cases:

          sendbuffersync
          while(bb.remaining() != 0) {
          if (sock != null)

          { sock.write(bb); }

          }

          say the client requests, then never reads the response (and the response is very large, so it fills up the tcp buffers and such).

          Show
          Patrick Hunt added a comment - Seems like this could infinite loop in some bad cases: sendbuffersync while(bb.remaining() != 0) { if (sock != null) { sock.write(bb); } } say the client requests, then never reads the response (and the response is very large, so it fills up the tcp buffers and such).
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12442903/ZOOKEEPER-737.patch
          against trunk revision 938212.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/148/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/148/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/148/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12442903/ZOOKEEPER-737.patch against trunk revision 938212. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/148/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/148/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/148/console This message is automatically generated.
          Hide
          Mahadev konar added a comment -

          addressed pat's comments on the logging and try and finally block.

          If cancel key fails, the socket will be closed later to make sure there is no resource leak.

          Show
          Mahadev konar added a comment - addressed pat's comments on the logging and try and finally block. If cancel key fails, the socket will be closed later to make sure there is no resource leak.
          Hide
          Patrick Hunt added a comment -

          Mahadev, this looks great, a few issues I noticed:

          If the pwriter flush/close calls fail for any reason we will not close the socket. In cleanupWriterSocket I think you should have a try/finally around the pwriter conditional block. This will ensure that no matter what happens we will close the socket.

          What happens if cancel key fails? Is this really a debug level message? What would happen if cancel fails but later the socket is explicitly closed (after command returns)?

          + if (k != null) {
          + try

          { + k.cancel(); + }

          catch(Exception e)

          { + LOG.debug("Error cancelling command selection key ", e); + }

          Are these errors or info messages? Seems like both are LOG.error? (ie we'd really want to know if this happened)

          LOG.info("Error sending data synchronously ", ie);
          LOG.info("Error closing a command socket ", e);

          Show
          Patrick Hunt added a comment - Mahadev, this looks great, a few issues I noticed: If the pwriter flush/close calls fail for any reason we will not close the socket. In cleanupWriterSocket I think you should have a try/finally around the pwriter conditional block. This will ensure that no matter what happens we will close the socket. What happens if cancel key fails? Is this really a debug level message? What would happen if cancel fails but later the socket is explicitly closed (after command returns)? + if (k != null) { + try { + k.cancel(); + } catch(Exception e) { + LOG.debug("Error cancelling command selection key ", e); + } Are these errors or info messages? Seems like both are LOG.error? (ie we'd really want to know if this happened) LOG.info("Error sending data synchronously ", ie); LOG.info("Error closing a command socket ", e);
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12442478/ZOOKEEPER-737.patch
          against trunk revision 936624.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/68/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/68/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/68/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12442478/ZOOKEEPER-737.patch against trunk revision 936624. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/68/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/68/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/68/console This message is automatically generated.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12442478/ZOOKEEPER-737.patch
          against trunk revision 934312.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/61/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/61/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/61/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12442478/ZOOKEEPER-737.patch against trunk revision 934312. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/61/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/61/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/61/console This message is automatically generated.
          Hide
          Mahadev konar added a comment -

          changed (0x0) for interest ops to be (0), to keep it consistent with how we print interest ops.

          Show
          Mahadev konar added a comment - changed (0x0) for interest ops to be (0), to keep it consistent with how we print interest ops.
          Hide
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12442455/ZOOKEEPER-737.patch
          against trunk revision 934312.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/60/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/60/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/60/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12442455/ZOOKEEPER-737.patch against trunk revision 934312. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/60/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/60/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/60/console This message is automatically generated.
          Hide
          Patrick Hunt added a comment -

          This updated patch verifies that we can support the case where the write channel is closed (client perspective) before the server has fully responded. This fails prior to Mahadev's fix being applied.

          Show
          Patrick Hunt added a comment - This updated patch verifies that we can support the case where the write channel is closed (client perspective) before the server has fully responded. This fails prior to Mahadev's fix being applied.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12442394/ZOOKEEPER-737.patch
          against trunk revision 934312.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/59/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/59/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12442394/ZOOKEEPER-737.patch against trunk revision 934312. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/59/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/59/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/59/console This message is automatically generated.
          Hide
          Mahadev konar added a comment -

          this patch fixes the issue. The problem was the cons wasnt returning anything if there are only 4 letters commands as connections. This fixes the issue to return stats for that.

          I will be adding test case shortly.

          Show
          Mahadev konar added a comment - this patch fixes the issue. The problem was the cons wasnt returning anything if there are only 4 letters commands as connections. This fixes the issue to return stats for that. I will be adding test case shortly.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12442363/ZOOKEEPER-737.patch
          against trunk revision 934312.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/58/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/58/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/58/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12442363/ZOOKEEPER-737.patch against trunk revision 934312. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/58/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/58/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/58/console This message is automatically generated.
          Hide
          Patrick Hunt added a comment -

          Yes, there are 2 sets of tests, one for standalone and one for quorum.

          FourLetterWordsQuorumTest.java
          FourLetterWordsTest.java

          However there is a flaw in these tests. They use a socket to do the test, ie connect, send, verify resp, close socket

          This is not what NC does, it does a connect, send, close write channel, verify resp, close socket

          and that's why nc fails while everything else was fine.

          I'd suggest adding a new set of tests that not only do the old test, but also replicate nc. It would also be good to verify
          that the connection count stays the same (ie that we don't leak connections).

          Should I do the changes for the tests or do you want to?

          Show
          Patrick Hunt added a comment - Yes, there are 2 sets of tests, one for standalone and one for quorum. FourLetterWordsQuorumTest.java FourLetterWordsTest.java However there is a flaw in these tests. They use a socket to do the test, ie connect, send, verify resp, close socket This is not what NC does, it does a connect, send, close write channel, verify resp, close socket and that's why nc fails while everything else was fine. I'd suggest adding a new set of tests that not only do the old test, but also replicate nc. It would also be good to verify that the connection count stays the same (ie that we don't leak connections). Should I do the changes for the tests or do you want to?
          Hide
          Mahadev konar added a comment -

          please try this patch out with all kinds of commands.

          also, pat do you have a testcase for four letter words that I can add a testcase to?

          Show
          Mahadev konar added a comment - please try this patch out with all kinds of commands. also, pat do you have a testcase for four letter words that I can add a testcase to?
          Hide
          Mahadev konar added a comment -

          this patch fixes the issue.

          • it removes the selection key from the selector via calling selectionkey.cancel()
          • sends the output of the stat command via the socket and does not use the selector for sending it
          • closes the socket appropriately after the server is done sending the response to the client
          Show
          Mahadev konar added a comment - this patch fixes the issue. it removes the selection key from the selector via calling selectionkey.cancel() sends the output of the stat command via the socket and does not use the selector for sending it closes the socket appropriately after the server is done sending the response to the client
          Hide
          Patrick Hunt added a comment -

          fyi, typically this is not an issue for 4 letter word clients as they open the socket, write the request, read the response
          then close the socket. However it seems nc is closing the write channel as soon as it sends all it has to send (before it
          necess. reads). As a result you get this timing issue with nc.

          Show
          Patrick Hunt added a comment - fyi, typically this is not an issue for 4 letter word clients as they open the socket, write the request, read the response then close the socket. However it seems nc is closing the write channel as soon as it sends all it has to send (before it necess. reads). As a result you get this timing issue with nc.

            People

            • Assignee:
              Mahadev konar
              Reporter:
              Patrick Hunt
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development