  1. ZooKeeper
  2. ZOOKEEPER-3894

Out-of-order response after session moved



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: server
    • Labels:


      A bug in NIOServerCnxn can cause a client to fail with an error about out-of-order xids. What actually happens, as I understand it, is:

      1. Client attempts to renew its session on slow server S1.
      2. The attempt times out.
      3. Client attempts to renew its session on server S2.
      4. The attempt succeeds. S2 now owns the session.
      5. The client sends one or more requests. The responses are large enough that they fill the socket's buffer in S2.
      6. The original attempt finally succeeds. S1 now owns the session, but the client is still connected to S2.
      7. The client sends an asynchronous request A to S2. Because the session has moved, S2 instructs the NIOServerCnxn to close. This is implemented as an empty sentinel value added to the queue of outgoing buffers.
      8. The client sends some read request B to S2, and the response is enqueued behind the sentinel.
      9. The doIO method of NIOServerCnxn writes its enqueued buffers to the socket, and then it closes the socket because one of the buffers was the sentinel.
      10. Before the client observes that the socket is closed, it receives the response for B, and fails with an error because it expected the response for A.
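
To illustrate why step 10 is fatal for the client, here is a minimal sketch of a client that matches responses to pending requests strictly in send order. The class PendingQueueSketch and its methods are hypothetical names made up for this illustration, and the error text is only an approximation; this is not the actual ZooKeeper client code.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified model of a client that expects responses in the
// same order as its requests. Not ZooKeeper's actual client implementation.
final class PendingQueueSketch {
    // Xids of requests that were sent and still await a response, oldest first.
    private final Deque<Integer> pendingXids = new ArrayDeque<>();

    // Record a request that has been sent to the server.
    void sent(int xid) {
        pendingXids.addLast(xid);
    }

    // Accept a response only if it answers the oldest pending request.
    void received(int xid) throws IOException {
        Integer expected = pendingXids.peekFirst();
        if (expected == null || expected != xid) {
            throw new IOException(
                "Xid out of order. Got " + xid + ", expected " + expected);
        }
        pendingXids.removeFirst();
    }

    int pendingCount() {
        return pendingXids.size();
    }
}
```

In the scenario above, A and B are both pending when the response for B arrives, so a check like this one rejects it even though the server never intended to answer A.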

      I think the fix is simply to avoid writing messages that were enqueued after the sentinel.
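
A rough sketch of the proposed fix, under assumptions: the outgoing queue is modelled as a plain queue of ByteBuffers, and the close request as an empty buffer compared by identity. OutgoingQueueSketch, CLOSE_SENTINEL, and drainUntilClose are hypothetical names for this illustration and do not correspond to the real NIOServerCnxn members.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of a connection's outgoing queue; not the real NIOServerCnxn.
final class OutgoingQueueSketch {
    // Empty buffer used purely as a marker; compared by identity, not content.
    static final ByteBuffer CLOSE_SENTINEL = ByteBuffer.allocate(0);

    private final Queue<ByteBuffer> outgoing = new ArrayDeque<>();
    private boolean closeRequested = false;

    void enqueue(ByteBuffer buffer) {
        outgoing.add(buffer);
    }

    // Returns the buffers that may still be written to the socket.
    // Buffers queued before the sentinel are kept; the sentinel itself and
    // everything queued after it are discarded.
    List<ByteBuffer> drainUntilClose() {
        List<ByteBuffer> toWrite = new ArrayList<>();
        ByteBuffer buffer;
        while ((buffer = outgoing.poll()) != null) {
            if (buffer == CLOSE_SENTINEL) {
                closeRequested = true;
                outgoing.clear(); // drop responses enqueued after the close request
                break;
            }
            toWrite.add(buffer);
        }
        return toWrite;
    }

    // True once a drain has encountered the sentinel; the caller should then close the socket.
    boolean closeRequested() {
        return closeRequested;
    }
}
```

With this behaviour, the response for B in step 8 would never reach the socket, so the client would see only the closed connection rather than a response with an unexpected xid.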




            • Assignee:
              jmcarthur Jake McArthur


              • Created: