ZooKeeper / ZOOKEEPER-140

Deadlock in QuorumCnxManager

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0
    • Fix Version/s: 3.0.0
    • Component/s: None
    • Labels: None

    Description

      Frequently the servers deadlock in QuorumCnxManager:initiateConnection on
      s.read(msgBuffer) when reading the challenge from the peer.

      Calls to initiateConnection and receiveConnection are synchronized, so only one or the other can be executing at a time. This prevents two connections from opening between the same pair of servers.

      However, it seems that this leads to deadlock, as in this scenario:

      A (initiate --> B)
      B (initiate --> C)
      C (initiate --> A)
      

      initiateConnection can only complete when receiveConnection runs on the remote peer and answers the challenge. If all servers are blocked in initiateConnection, receiveConnection never runs and leader election halts.
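
      The cycle above can be reproduced in isolation. What follows is a minimal sketch, not the QuorumCnxManager code: the class CnxDeadlockSketch and the method stalledInitiators are hypothetical names, each peer's monitor is modelled as a ReentrantLock, and the initiating thread grabs the remote lock itself to stand in for the remote receiveConnection call. A timeout is used only so that the sketch can report the stall instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class CnxDeadlockSketch {
    /**
     * Peer i initiates a connection to targets[i]; a negative target means
     * the peer stays idle. Returns how many initiators never got an answer
     * within timeoutMs. (Hypothetical helper, for illustration only.)
     */
    static int stalledInitiators(int[] targets, long timeoutMs) {
        int n = targets.length;
        // One lock per peer, standing in for the monitor shared by
        // initiateConnection and receiveConnection on that peer.
        ReentrantLock[] monitors = new ReentrantLock[n];
        for (int i = 0; i < n; i++) monitors[i] = new ReentrantLock();

        int active = 0;
        for (int t : targets) if (t >= 0) active++;
        CountDownLatch allInside = new CountDownLatch(active);
        CountDownLatch resolved = new CountDownLatch(active);
        AtomicInteger stalled = new AtomicInteger();
        Thread[] threads = new Thread[n];

        for (int i = 0; i < n; i++) {
            final int self = i, target = targets[i];
            threads[i] = new Thread(() -> {
                if (target < 0) return;
                monitors[self].lock(); // enter "initiateConnection"
                try {
                    allInside.countDown();
                    allInside.await(); // every initiator now holds its own monitor
                    // The remote "receiveConnection" needs the remote monitor.
                    boolean answered =
                        monitors[target].tryLock(timeoutMs, TimeUnit.MILLISECONDS);
                    if (answered) {
                        monitors[target].unlock(); // challenge answered
                    } else {
                        stalled.incrementAndGet();
                    }
                    resolved.countDown();
                    // A stuck initiator never leaves its monitor; hold it until
                    // every other initiator has also resolved.
                    if (!answered) resolved.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    monitors[self].unlock(); // leave "initiateConnection"
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stalled.get();
    }
}
```

      Calling CnxDeadlockSketch.stalledInitiators(new int[]{1, 2, 0}, 200) models the A --> B, B --> C, C --> A ring and reports all three initiators as stalled, while a single initiator such as new int[]{1, -1, -1} completes with none stalled. In the sketch, as in the scenario described, the stall comes from holding the local monitor while waiting for work that needs the remote one.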

            People

              fpj Flavio Paiva Junqueira