ZooKeeper / ZOOKEEPER-3537

Leader election - Use of out of election messages


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • None
    • Fix Version/s: 3.6.0
    • None

    Description

      Hello ZooKeeper developers,

      in lookForLeader in FastLeaderElection, the following switch cases handle a received notification message n whose n.state is either FOLLOWING or LEADING (https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L1029).

      case FOLLOWING:
      case LEADING:
        /*
         * Consider all notifications from the same epoch
         * together.
         */
        if (n.electionEpoch == logicalclock.get()) {
          recvset.put(n.sid, new Vote(n.leader, n.zxid, n.electionEpoch, n.peerEpoch));
          voteSet = getVoteTracker(recvset, new Vote(n.version, n.leader, n.zxid, n.electionEpoch, n.peerEpoch, n.state));
          if (voteSet.hasAllQuorums() && checkLeader(outofelection, n.leader, n.electionEpoch)) {
            setPeerState(n.leader, voteSet);
            Vote endVote = new Vote(n.leader, n.zxid, n.electionEpoch, n.peerEpoch);
            leaveInstance(endVote);
            return endVote;
          }
        }
      
        /*
         * Before joining an established ensemble, verify that
         * a majority are following the same leader.
         */
        outofelection.put(n.sid, new Vote(n.version, n.leader, n.zxid, n.electionEpoch, n.peerEpoch, n.state));
        voteSet = getVoteTracker(outofelection, new Vote(n.version, n.leader, n.zxid, n.electionEpoch, n.peerEpoch, n.state));
      
        if (voteSet.hasAllQuorums() && checkLeader(outofelection, n.leader, n.electionEpoch)) {
          synchronized (this) {
            logicalclock.set(n.electionEpoch);
            setPeerState(n.leader, voteSet);
          }
          Vote endVote = new Vote(n.leader, n.zxid, n.electionEpoch, n.peerEpoch);
          leaveInstance(endVote);
          return endVote;
        }
        break;

       

      We notice that when n.electionEpoch == logicalclock.get(), votes are added to recvset. However, checkLeader is called immediately afterwards with the votes in outofelection, as can be seen here (https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/FastLeaderElection.java#L1037).

      Checking outofelection instead of recvset does not cause any problems.
      If checkLeader fails on outofelection although it would have succeeded on recvset, the check succeeds immediately afterwards, once the vote has been added to outofelection.
      Still, it seems more natural to check for a leader in recvset rather than in outofelection.
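
      The fall-through described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the ZooKeeper implementation: SimpleVote, the three-argument checkLeader, and the peer ids are hypothetical stand-ins, quorum tracking (getVoteTracker / hasAllQuorums) and the epoch comparison are left out, and the toy stores the sender's state in both maps.

```java
import java.util.HashMap;
import java.util.Map;

class CheckLeaderSketch {
    enum State { LOOKING, FOLLOWING, LEADING }

    // Hypothetical, stripped-down vote: only the fields this sketch needs.
    static final class SimpleVote {
        final long leader;
        final State state;

        SimpleVote(long leader, State state) {
            this.leader = leader;
            this.state = state;
        }
    }

    // Simplified leader check (an assumption, not the real method): when this
    // peer is not the proposed leader, the map must hold a vote sent by the
    // leader itself that reports the LEADING state.
    static boolean checkLeader(Map<Long, SimpleVote> votes, long leader, long selfId) {
        if (leader == selfId) {
            return true; // epoch comparison omitted in this sketch
        }
        SimpleVote fromLeader = votes.get(leader);
        return fromLeader != null && fromLeader.state == State.LEADING;
    }

    // Replays one notification from the leader (sid 3) as seen by peer 1.
    // Returns { check on recvset, check on outofelection before the put,
    //           check on outofelection after the put }.
    static boolean[] run() {
        long selfId = 1L;
        long leaderId = 3L;
        Map<Long, SimpleVote> recvset = new HashMap<>();
        Map<Long, SimpleVote> outofelection = new HashMap<>();
        SimpleVote n = new SimpleVote(leaderId, State.LEADING);

        // Same-epoch branch: the vote goes into recvset first.
        recvset.put(leaderId, n);
        boolean onRecvset = checkLeader(recvset, leaderId, selfId);
        boolean onOutOfElectionBefore = checkLeader(outofelection, leaderId, selfId);

        // Fall-through: the vote is then added to outofelection and checked again.
        outofelection.put(leaderId, n);
        boolean onOutOfElectionAfter = checkLeader(outofelection, leaderId, selfId);

        return new boolean[] { onRecvset, onOutOfElectionBefore, onOutOfElectionAfter };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("recvset check: " + r[0]);
        System.out.println("outofelection check before put: " + r[1]);
        System.out.println("outofelection check after put: " + r[2]);
    }
}
```

      In this toy run the first check against outofelection fails only because the leader's own vote has not been recorded there yet. The second check passes once it has been, which is why the outcome is the same, while checking recvset in the first branch would express the intent more directly.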

      Cheers,
      Karolos

       


            People

              Assignee: Karolos Antoniadis (karolos)
              Reporter: Karolos Antoniadis (karolos)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 50m