  1. ZooKeeper
  2. ZOOKEEPER-3394

Delay observer reconnect when all learner masters have been tried

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.6.0
    • Fix Version/s: 3.6.0
    • Component/s: quorum

      Description

      Observers disconnect when the voting peers perform a leader election and reconnect afterwards. The zookeeper.observer.reconnectDelayMs delay was added to insulate the voting peers from the returning observers. With a large number of peers and the observerMaster feature active, this delay is mostly detrimental: an observer is more likely to get stuck trying to connect to a bad (down or corrupt) peer when it would be better off switching to a new one quickly.

      To retain the protective value of the delay, it makes sense to apply it only after all observer masters in the list have been tried, before iterating through the list again. When observer masters are not active, this degenerates to a delay between connection attempts to the leader.
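
The proposed behaviour can be illustrated with a minimal sketch. This is not the actual ZooKeeper Observer code: the class name, the connect method, the tryConnect predicate, the injected sleeper, and the host:port strings are all hypothetical and exist only for illustration. The sketch assumes failed attempts move on to the next learner master immediately, and that the reconnect delay is applied only once a full pass over the list has failed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.Predicate;

/** Hypothetical illustration only; not taken from the ZooKeeper code base. */
final class ObserverReconnectSketch {

    /**
     * Tries each learner master in list order. Returns the first one that
     * accepts the connection, or null if maxRounds full passes all fail.
     * The delay is applied once per failed pass, never between peers.
     */
    static String connect(List<String> masters,
                          Predicate<String> tryConnect,
                          long reconnectDelayMs,
                          int maxRounds,
                          LongConsumer sleeper) {
        for (int round = 0; round < maxRounds; round++) {
            for (String master : masters) {
                if (tryConnect.test(master)) {
                    return master;
                }
                // Failed attempt: switch to the next master without waiting.
            }
            // Every master in the list has been tried and failed, so back
            // off before starting over. With a single-entry list (only the
            // leader), this is a delay between attempts on the leader.
            sleeper.accept(reconnectDelayMs);
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up addresses; only the third peer accepts connections here.
        List<String> masters = Arrays.asList("peer1:2191", "peer2:2191", "peer3:2191");
        String chosen = connect(
                masters,
                m -> m.startsWith("peer3"),
                200L,
                2,
                ms -> System.out.println("all masters tried, waiting " + ms + " ms"));
        System.out.println("connected to " + chosen);
    }
}
```

Injecting the sleeper rather than calling Thread.sleep directly is a choice made for this sketch so that the back-off can be observed without real waiting.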

              People

              • Assignee: Brian Nixon (nixon)
              • Reporter: Brian Nixon (nixon)
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 2h