  1. ZooKeeper
  2. ZOOKEEPER-3086

[server] Lack of write timeouts causes quorum to get stuck

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.4, 3.4.12
    • Fix Version/s: None
    • Component/s: quorum
    • Labels:
      None
    • Environment:

      Linux 4.13.0-32-generic, Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

      Description

      A network outage on the leader host can cause the `QuorumPeer` thread to get stuck for a prolonged period (2+ hours, depending on TCP keep-alive settings). This effectively stalls the whole ZooKeeper server, making it inoperable. We found this during one of our internal DRTs (Disaster Recovery Tests).

      The scenario that triggers the behavior (requires a relatively high ping load on the follower):

      1. `Follower.processPacket` processes a `Leader.PING` message.
      2. The leader becomes network partitioned.
      3. `Learner.ping` attempts to write to the leader socket.
      4. If the socket's write buffer is full (due to other ping/sync calls), `Learner.ping` blocks.
      5. Because the leader is partitioned and the write has no timeout, `Learner.ping` blocks indefinitely.
      6. `QuorumPeer` is the only thread reading from the leader socket, so the whole server is stuck and cannot recover without a manual process restart.
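
      The blocking behavior in steps 4 and 5 can be reproduced outside ZooKeeper. The following is a minimal sketch, not ZooKeeper code: the class `WriteTimeoutSketch` and the method `writeStalls` are hypothetical names, and a loopback peer that never reads stands in for the partitioned leader. It relies on the fact that `Socket.setSoTimeout` bounds reads only, so a blocking `write` on a full send buffer has no deadline of its own. Bounding the wait from a second thread is one possible guard, shown here purely for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; not part of ZooKeeper.
final class WriteTimeoutSketch {

    /**
     * Writes to a loopback peer that never reads and reports whether the
     * blocking write was still pending after timeoutMs milliseconds.
     */
    static boolean writeStalls(long timeoutMs) {
        // Daemon thread so a stuck write can never keep the JVM alive.
        ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "blocked-writer");
            t.setDaemon(true);
            return t;
        });
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket silentPeer = server.accept()) { // accepted, but never read from

            client.setSoTimeout(100); // read timeout only; write() is not affected
            OutputStream out = client.getOutputStream();
            byte[] chunk = new byte[64 * 1024];

            // Try to push up to 256 MiB; the kernel buffers fill long before that,
            // after which write() blocks because the peer never drains them.
            Callable<Void> fillBuffers = () -> {
                for (int i = 0; i < 4096; i++) {
                    out.write(chunk);
                }
                return null;
            };
            Future<Void> pending = writer.submit(fillBuffers);
            try {
                pending.get(timeoutMs, TimeUnit.MILLISECONDS);
                return false; // everything was accepted; no stall observed
            } catch (TimeoutException stalled) {
                // Leaving the try block closes the socket, which makes the
                // blocked write() fail instead of waiting indefinitely.
                return true;
            }
        } catch (IOException | ExecutionException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            writer.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("write stalled: " + writeStalls(500));
    }
}
```

      In this sketch the caller recovers by closing the socket, which forces the blocked write to fail. Whether a similar guard would suit `Learner.ping` is a design question for the ZooKeeper code itself.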

       

      A thread dump from the affected server is attached.

        Attachments

        1. zookeeper-threads.txt
          150 kB
          Ruslan Nigmatullin


            People

            • Assignee:
              Unassigned
              Reporter:
              elessar Ruslan Nigmatullin
            • Votes:
              0
              Watchers:
              4
