ZooKeeper / ZOOKEEPER-975

new peer goes in LEADING state even if ensemble is online

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.2
    • Fix Version/s: 3.4.0
    • Component/s: None
    • Labels: None

      Description

      Scenario:
      1. Two of the three ZK nodes are online.
      2. The third node attempts to join.
      3. The third node unnecessarily goes into the "LEADING" state.
      4. The third node then goes back to LOOKING (no majority of followers) and finally moves to FOLLOWING.

      While going through the logs I noticed that a peer C that is trying to
      join an already formed cluster goes in LEADING state. This is because
      QuorumCnxManager of A and B sends the entire history of notification
      messages to C. C receives the notification messages that were
      exchanged between A and B when they were forming the cluster.

      In FastLeaderElection.lookForLeader(), due to the following piece of
      code, C quits lookForLeader assuming that it is supposed to lead.

      // If have received from all nodes, then terminate
      if ((self.getVotingView().size() == recvset.size()) &&
              (self.getQuorumVerifier().getWeight(proposedLeader) != 0)) {
          self.setPeerState((proposedLeader == self.getId()) ?
                  ServerState.LEADING : learningState());
          leaveInstance();
          return new Vote(proposedLeader, proposedZxid);
      } else if (termPredicate(recvset,

      This can cause:
      1. C to unnecessarily go into the LEADING state, wait for tickTime * initLimit, and then restart FLE.

      2. C waits for 200 ms (finalizeWait) and then considers whatever
      notifications it has received to make a decision. C could potentially
      decide to follow an old leader, fail to connect to the leader, and
      then restart FLE. See code below.

      if (termPredicate(recvset,
              new Vote(proposedLeader, proposedZxid, logicalclock))) {

          // Verify if there is any change in the proposed leader
          while ((n = recvqueue.poll(finalizeWait,
                  TimeUnit.MILLISECONDS)) != null) {
              if (totalOrderPredicate(n.leader, n.zxid,
                      proposedLeader, proposedZxid)) {
                  recvqueue.put(n);
                  break;
              }
          }

      In general, this does not affect correctness of FLE since C will
      eventually go back to FOLLOWING state (A and B won't vote for
      C). However, this delays C from joining the cluster. This can in turn
      affect recovery time of an application.

      Proposal: A and B should send only the latest notification (most
      recent) instead of the entire history. Does this sound reasonable?

      1. ZOOKEEPER-975.patch
        24 kB
        Vishal Kher
      2. ZOOKEEPER-975.patch
        6 kB
        Vishal Kher
      3. ZOOKEEPER-975.patch2
        24 kB
        Vishal Kher
      4. ZOOKEEPER-975.patch3
        29 kB
        Vishal Kher
      5. ZOOKEEPER-975.patch4
        32 kB
        Vishal Kher
      6. ZOOKEEPER-975.patch5
        32 kB
        Vishal Kher
      7. ZOOKEEPER-975.patch6
        32 kB
        Vishal Kher

        Activity

        Benjamin Reed added a comment -

        moved to 3.4

        Flavio Junqueira added a comment -

        Thanks for bringing this up, Vishal. This is not a new observation, although I can't remember if we discussed it in a jira or not. In general, I'm lukewarm about this change. It is certainly not an issue to avoid the server going into LEADING before it goes correctly into LOOKING, but I'm not entirely comfortable with manipulating the queues of notifications. Being able to have two servers concurrently thinking they are leading is a situation supported by our protocols, and such a modification would be an optimization to avoid the unnecessary LEADING step.

        Regarding application recovery time, we don't have a load balance scheme at this point, which could be quite useful, so bringing a new follower up does not guarantee that clients will move their sessions to the new follower. Note that this situation occurs only if there is an ensemble running and a server joins or recovers.

        Vishal Kher added a comment -

        Hi Flavio,

        What is the motivation to send the history of notifications to the
        joining peer? Shouldn't the most recent notification (or just the
        current state) be enough? I understand this
        is a performance issue. However, I think it is a sizeable hole.

        • There could have been multiple leader
          elections while the node is down and the node could end up hopping
          across leaders until it gets to the correct leader.
        • Suppose, we have a 3 node cluster. I have a simple client which
          connects to A and creates a znode_A to indicate that A (and the
          client) is online. The leader A disconnects from B and C and causes C
          to take leadership. Now, when A is trying to join the cluster, it can
          be unnecessarily delayed due to this bug. If I have an application that
          takes some action if znode_A is unavailable, then this bug can
          unnecessarily trigger that action. We are facing this problem in our application.

        I think it will be a small change to QCM. What do you think?

        Vishal Kher added a comment -

        Hi Flavio,

        Here's a simple fix that I think does the trick. Let me know what you think.

        Flavio Junqueira added a comment -

        Hi Vishal, The main problem I see is that the patch you propose removes lastMessageSent (ZOOKEEPER-481), which was inserted to deal with a problem observed in ZOOKEEPER-475, and also discussed in ZOOKEEPER-480.

        Vishal Kher added a comment -

        Hi Flavio,

        Do you think that this will be a problem even after we have the patch for ZOOKEEPER-932?

        This is what ZOOKEEPER-475 describes:
        ----------

        • Replica 1 sends a message to itself and to Replica 2 stating that its current
          vote is for replica 1;
        • Replica 2 sends a message to itself and to Replica 1 stating that its current
          vote is for replica 2;
        • Replica 1 updates its vote, and sends a message to itself stating that its
          current vote is for replica 2;
        • Since replica 1 has two votes for 2 in an ensemble of 3 replicas, replica 1
          decides to follow 2.

        The problem is that replica 2 does not receive a message from 1 stating that it
        changed its vote to 2, which prevents 2 from becoming a leader. Now looking
        more carefully at why that happened, you can see that when 1 tries to send a
        message to 2, QuorumCnxManager in 1 is both shutting down a connection to 2 at
        the same time that it is trying to open a new one. The incorrect
        synchronization prevents the creation of a new connection, and 1 and 2 end up
        not connected.
        ----------

        We no longer have incorrect synchronization. We can have QCM in 1 shutting down the connection to 2 while it is trying to send a notification to 2. However, the only time 1 will shutdown a connection to 2 is when it receives a new connection request from 2 (or when something is wrong with the connection). A new connection request is received when 2 is trying to send a notification to 1. As a result, 1 will end up sending a notification to 2 saying that it is following 2. Do you agree?

        Vishal Kher added a comment -

        Hi Flavio,

        Can you describe to me this problem? I will see if the problem still exists if this patch is applied after applying patch for ZOOKEEPER-932. Thanks.

        Flavio Junqueira added a comment -

        Hi Vishal, This is a general concurrency issue that arises when using TCP connections, and I'm not sure we can avoid it even though our protocol is in principle resilient to such message losses. Back to the TCP issue I was mentioning, we remove the message from the queue of messages to send, and send it with a write call to the socket. However, it is not guaranteed that the message will get through, since the destination might drop the connection while the message is in transit. The sender has no way to know if it really got through, unless we have an acknowledgment scheme on top of TCP, which sounds overkill. The way we found to avoid the case in one of the jiras I mentioned above was to resend the last message the peer dequeued for a given destination. This is exactly the mechanism you're proposing to remove.
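        [The resend mechanism Flavio describes can be sketched roughly as follows. This is a minimal, hypothetical illustration, not ZooKeeper's actual QuorumCnxManager code; all names are stand-ins. The sender remembers the last message dequeued per destination and re-enqueues it on reconnect, since a plain TCP write gives no guarantee the peer received the message before dropping the connection.]

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Sketch of the "resend last dequeued message on reconnect" idea.
public class LastMessageSketch {
    private final Map<Long, Queue<String>> queueSendMap = new HashMap<>();
    private final Map<Long, String> lastMessageSent = new HashMap<>();

    void queueMessage(long sid, String msg) {
        queueSendMap.computeIfAbsent(sid, k -> new ArrayDeque<>()).add(msg);
    }

    // Called by the sender thread: dequeue and remember what was handed
    // to the socket, because the write itself may be lost in transit.
    String dequeueAndSend(long sid) {
        Queue<String> q = queueSendMap.get(sid);
        String msg = (q == null) ? null : q.poll();
        if (msg != null) {
            lastMessageSent.put(sid, msg);
        }
        return msg;
    }

    // Called on (re)connect: re-enqueue the last message sent to this
    // peer, so a message dropped with the old connection is retried.
    void onReconnect(long sid) {
        String last = lastMessageSent.get(sid);
        if (last != null) {
            queueMessage(sid, last);
        }
    }
}
```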

        Vishal Kher added a comment -

        Hi Flavio,

        ok, I will re-add lastMessage. How about before sending lastMessage() we check if the send queue is empty? If it is empty, then we send lastMessage; otherwise, we send the message from the queue. This will avoid sending stale messages to the peer.

        So the final fix will:
        1. Change CAPACITY to 1.
        2. send lastMessage to A only if the send queue for A is empty.

        This will ensure that we send the most recent notification and we also handle lost messages as before.

        Does this sound ok?
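        [The two-part fix above can be sketched as follows. This is an illustrative outline under the stated assumptions, not the actual QuorumCnxManager patch; names are hypothetical. With the per-peer send queue capped at 1, a stale resend never shadows a fresher pending notification.]

```java
import java.util.concurrent.ArrayBlockingQueue;

// Sketch: capacity-1 send queue, and on reconnect resend the remembered
// last message only when nothing newer is already pending.
public class ReconnectPolicySketch {
    static final int CAPACITY = 1;

    final ArrayBlockingQueue<String> sendQueue =
            new ArrayBlockingQueue<>(CAPACITY);
    String lastMessageSent;

    // Decide what to hand to the sender thread on (re)connect.
    String messageToResendOnConnect() {
        // A pending message is newer than lastMessageSent: prefer it.
        String pending = sendQueue.poll();
        if (pending != null) {
            lastMessageSent = pending;
            return pending;
        }
        // Queue empty: resend the last message, which may have been lost.
        return lastMessageSent;
    }
}
```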

        Flavio Junqueira added a comment -

        Hi Vishal, Sounds like a good way to patch it.

        Vishal Kher added a comment -

        Hi Flavio,

        Re-adding lastMessage results in the delay reported in this bug. When node 2 attempts to join a cluster of node 1 and 0, nodes 1 and 0 send a "LOOKING" notification followed by a "FOLLOWING"/"LEADING" notification. After receiving the first pair of LOOKING notifications, 2 goes in the LEADING state.

        I think I have a better idea. Instead of sending lastMessage, how about we send the notification for the current state of the peer (just like we do in FastLeaderElection.java)? I think this will resolve both the problems.

        Vishal Kher added a comment -

        Hi Flavio,

        Do you agree that sending the current state is a better approach?

        I was looking at the code to add a method createNotificationMessage() that will return a notification message based on the current state of the server. QCM can call this method and send the notification message at the place where we send lastMessage. However, adding it to FastLeaderElection only won't be enough, since there are other leader election algorithms as well. The Election interface is very minimal. Should we add this method to this interface? Which leader election algorithms do we support, and is it ok to make this change only for the supported algorithms?

        Thanks
        -Vishal
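        [The createNotificationMessage() idea can be sketched as below. All types here are simplified, hypothetical stand-ins for ZooKeeper's classes, added for illustration only: the notification is built from what the server believes right now (its latest vote while LOOKING, the elected leader once FOLLOWING/LEADING) rather than replayed from queued history.]

```java
// Sketch of building a notification from current server state.
public class NotificationSketch {
    enum ServerState { LOOKING, FOLLOWING, LEADING }

    static class Vote {
        final long id;
        final long zxid;
        Vote(long id, long zxid) { this.id = id; this.zxid = zxid; }
    }

    static class Notification {
        final long leader;
        final long zxid;
        final ServerState state;
        Notification(long leader, long zxid, ServerState state) {
            this.leader = leader; this.zxid = zxid; this.state = state;
        }
    }

    ServerState peerState = ServerState.LOOKING;
    Vote currentVote = new Vote(-1, -1);

    // Return a notification reflecting this server's current belief,
    // instead of resending a possibly stale historical message.
    Notification createNotificationMessage() {
        return new Notification(currentVote.id, currentVote.zxid, peerState);
    }
}
```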

        Flavio Junqueira added a comment -

        Hi Vishal, I'm under the impression that getting the current state only works for LEADING or FOLLOWING. While looking, we would need to return the latest vote, yes? Overall, it is sound to return the current state instead of keeping the latest message.

        About maintenance, we have some time back talked about maintaining only the TCP version of FLE (FLE+QCM). There was never some real pressure to eliminate the others, and in fact previously some users were still using LE. I'm all for maintaining only FLE, but we need to hear the opinion of others.

        More thoughts?

        Vishal Kher added a comment -

        Hi Flavio,

        I have a patch for this, but I have it on top of the fix for ZOOKEEPER-932. We have 932 applied to our ZK code since we need it. Until ZOOKEEPER-932 is reviewed and committed, I will have to keep back-porting patches (and do double testing). I will port my changes to trunk if someone requires a fix for the bug. Since this is not a blocker, I am going to hold off the patch until 932 is reviewed. That will reduce my testing and porting overhead. Does that sound ok?

        The patch I have is good only for FLE.

        About maintenance, we have some time back talked about maintaining only the TCP version of FLE (FLE+QCM). There was never some real pressure to eliminate the others, and in fact previously some users were still using LE. I'm all for maintaining only FLE, but we need to hear the opinion of others. More thoughts?

        The documentation says: "The implementations of leader election 1 and 2 are currently not supported, and we have the intention of deprecating them in the near future. Implementations 0 and 3 are currently supported, and we plan to keep supporting them in the near future. To avoid having to support multiple versions of leader election unecessarily, we may eventually consider deprecating algorithm 0 as well, but we will plan according to the needs of the community."

        Is there a significant advantage of using LE 0 vs LE 3?

        Vishal Kher added a comment -

        I am attaching the patch for review only. This patch is not for final
        submission.

        I have not written a test yet. I will add a test and resubmit the
        patch. This patch is well tested along with the patch for 932. But I need
        to do more testing after separating these changes from 932. All
        junit tests pass.

        Main changes done:
        FastLeaderElection.java

        • Removed erroneous exit conditions

        QuorumCnxManager.java

        • Restrict send queue size to 1 so that we don't
          send too many stale messages
        • Added functions to add to
          ArrayBlockingQueue without blocking while inserting in the
          queue if the queue is full.
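        [The non-blocking insert described above can be sketched like this. The method name is illustrative, not the actual patch code: with the send queue capped at 1, a full queue means its single entry is stale, so evict it and enqueue the fresh notification instead of blocking the caller.]

```java
import java.util.concurrent.ArrayBlockingQueue;

// Sketch: insert into a bounded queue without blocking, dropping the
// stale entry when the queue is full so the newest message wins.
public class NonBlockingInsertSketch {
    static <T> void insertNonBlocking(ArrayBlockingQueue<T> q, T msg) {
        if (!q.offer(msg)) {   // queue full
            q.poll();          // drop the stale entry
            q.offer(msg);      // enqueue the most recent notification
        }
    }
}
```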
        Vishal Kher added a comment -

        patch for review only.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12474451/ZOOKEEPER-975.patch
        against trunk revision 1082362.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/195//console

        This message is automatically generated.

        Vishal Kher added a comment -

        Note there may be another problem in FLE unrelated to this bug. Originally, in QCM, instead of sending lastMessage I wanted to send a notification with the current state. But I noticed that a peer does not send its real current state in FastLeaderElection.java even when it is not LOOKING.
        Here's the relevant code:
        /*
         * If this server is not looking, but the one that sent the ack
         * is looking, then send back what it believes to be the leader.
         */
        Vote current = self.getCurrentVote();
        if (ackstate == QuorumPeer.ServerState.LOOKING) {
            if (LOG.isDebugEnabled()) {
                LOG.debug("Sending new notification. My id = " + self.getId()
                        + ", Recipient = " + response.sid
                        + " zxid =" + current.zxid
                        + " leader=" + current.id);
            }
            ToSend notmsg = new ToSend(
                    ToSend.mType.notification,
                    current.id,
                    current.zxid,    // <============ zxid of current vote
                    logicalclock,
                    self.getPeerState(),
                    response.sid);
            sendqueue.offer(notmsg);
        }

        It sends the zxid of the current vote and not the current zxid seen by the server. The zxid of the vote can be stale. So I dropped that change from QCM and changed the size of sendQueue to 1. I think fixing the above code is a separate issue and should be done later, if needed.

        Flavio Junqueira added a comment -

        Hi Vishal, I'll have a look at the patch once you generate a new one that applies.

        On the last comment you posted, I wonder why you think it would be better to update the last zxid according to the state of the server. One problem I see is that only the leader can really maintain its own last zxid up to date. The other servers don't really know how far the leader has gone.

        Also, the idea of maintaining the last vote intact is to keep the pair used to decide upon the current leader, and currently we don't use the zxid field to determine leadership if the notification says LEADING or FOLLOWING. Is there anything I'm missing?

        Vishal Kher added a comment -

        patch failed.

        Vishal Kher added a comment -

        Attaching patch on trunk for review/comments.

        Vishal Kher added a comment -

        Submitting patch on trunk.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12474564/ZOOKEEPER-975.patch2
        against trunk revision 1082362.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/199//testReport/
        Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/199//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/199//console

        This message is automatically generated.

        Flavio Junqueira added a comment -

        Hi Vishal, Here are two high-level comments:

        1. It does not have a test, but I'm not entirely convinced we should try to implement one, since it might be complex. We should think about it, though;
        2. I don't understand why you're removing the outofelection data structure. I believe the notifications from peers that are FOLLOWING/LEADING should be treated separately from the ones of peers that are LOOKING. If a peer obtains notifications from a quorum saying that the peers are LEADING/FOLLOWING, then it should follow the leader they point to. If a peer obtains notifications from a quorum of LOOKING peers, then it should follow the protocol to select a leader. Consequently, the notifications should be treated separately.
        Vishal Kher added a comment -

        1. It does not have a test, but I'm not entirely convinced we should try to implement one, since it might be complex. We should think about it, though;

        Yes, I didn't include a test. We should have one, but I don't think I can get to it in the next couple of days or so. I was thinking of feeding the recvQueue with dummy Notifications for the scenarios we want to test, then calling FastLeaderElection.lookForLeader() and verifying the outcome. The tricky part is to come up with the right sequence of Notifications to cover all corner cases.

        2. I don't understand why you're removing the outofelection data structure
        . I believe the notifications from peers that are FOLLOWING/LEADING should be treated separately from the ones of peers that are LOOKING. If a peer obtains notifications from a quorum saying that the peers are LEADING/FOLLOWING, then it should follow the leader they point to. If a peer obtains notifications from a quorum of LOOKING peers, then it should follow the protocol to select a leader. Consequently, the notifications should be treated separately.

        I wasn't sure about that change either. I don't mind reintroducing outofelection, but I couldn't think of a scenario where inserting a Notification in recvset instead of outofelection would be a problem. Considering that termPredicate verifies that a majority votes for the same <epoch, sid, zxid>, can you describe a scenario where this change could cause a problem?
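        The quorum check that termPredicate performs can be sketched roughly as follows. This is an illustrative simplification (hypothetical class names, a strict-majority rule), not ZooKeeper's actual implementation, which delegates quorum decisions to a QuorumVerifier:

        ```java
        import java.util.Map;

        // Illustrative sketch of the termPredicate idea discussed above: does
        // the set of received votes contain a quorum agreeing on the same
        // <epoch, leader, zxid> triple as the local proposal? All names and
        // the strict-majority rule are simplifications.
        public class TermPredicateSketch {
            static class Vote {
                final long leader, zxid, epoch;
                Vote(long leader, long zxid, long epoch) {
                    this.leader = leader;
                    this.zxid = zxid;
                    this.epoch = epoch;
                }
                boolean sameAs(Vote o) {
                    return leader == o.leader && zxid == o.zxid && epoch == o.epoch;
                }
            }

            // votes maps server id -> last vote received from that server
            static boolean termPredicate(Map<Long, Vote> votes, Vote proposal, int ensembleSize) {
                long agreeing = votes.values().stream().filter(proposal::sameAs).count();
                return agreeing > ensembleSize / 2; // strict majority quorum
            }
        }
        ```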

        Vishal Kher added a comment -

        correction: "I wasn't sure about that change either"

        Flavio Junqueira added a comment -

        The problem I see with mixing notifications from LOOKING peers and FOLLOWING/LEADING peers is the following. Peers that are either LEADING or FOLLOWING won't change their leader state based on new notifications. If a peer receives notifications from a quorum and one of the notifications says FOLLOWING/LEADING, then the peer cannot count on the sender of the notification to move to a new leader. The election will fail and the peer (perhaps along with others) will have to start over.

        Also, in the case a peer gets disconnected and comes back, if there is a quorum following a leader when the peer comes back, it needs to be able to determine who the leader is even if its leader election round is higher than the others. In this case, it has to follow a leader based on FOLLOWING/LEADING notifications, without comparing the votes in the notifications.

        For these reasons, we have kept FOLLOWING/LEADING notifications in a separate data structure. It is true, though, that we could keep all of them in the same structure and simply filter based on the state field when processing.
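        The separation described above can be sketched roughly like this. All names and types here are illustrative stand-ins, not ZooKeeper's actual classes:

        ```java
        import java.util.HashMap;
        import java.util.Map;

        // Rough sketch of the two-structure idea: votes from LOOKING peers
        // accumulate in recvset and are compared by the election protocol,
        // while votes from peers that have already settled (FOLLOWING/LEADING)
        // go into outofelection and let a (re)joining peer discover the
        // established leader directly.
        public class NotificationSetsSketch {
            enum PeerState { LOOKING, FOLLOWING, LEADING }

            final Map<Long, Long> recvset = new HashMap<>();        // sid -> proposed leader (LOOKING peers)
            final Map<Long, Long> outofelection = new HashMap<>();  // sid -> current leader (settled peers)

            void onNotification(long sid, PeerState state, long leader) {
                if (state == PeerState.LOOKING) {
                    recvset.put(sid, leader);
                } else {
                    // Settled peers will not change their vote, so they are
                    // tracked apart and matched without epoch/zxid ordering.
                    outofelection.put(sid, leader);
                }
            }

            // True if a strict majority of settled peers point at the same
            // leader, in which case a joining peer can simply follow it.
            boolean settledQuorumFor(long leader, int ensembleSize) {
                long n = outofelection.values().stream().filter(l -> l == leader).count();
                return n > ensembleSize / 2;
            }
        }
        ```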

        Vishal Kher added a comment -

        Thanks for the clarification. I will reintroduce separate data structure for LEADING/FOLLOWING notifications.

        Vishal Kher added a comment -

        Hi Flavio,

        I think a peer should check whether the received notification originated from a cluster member. This check is done for LOOKING notifications, but only after modifying proposedLeader, zxid, and epoch. The verification is not performed for LEADING/FOLLOWING notifications.

        I am planning to add this check at the very beginning:

        else if (self.getVotingView().containsKey(n.sid)) {
            switch (n.state) {
                case LOOKING:
                    [...]
            }
        }

        Do you see any issues with this (especially while using OBSERVER_ID)?

        Vishal Kher added a comment -
        • added outofelection
        • small additional change to drop notifications that do not come from peers in the voting view.

        Testing done:
        junit
        systest
        rebooted peers several times

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12474906/ZOOKEEPER-975.patch3
        against trunk revision 1082362.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/209//testReport/
        Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/209//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/209//console

        This message is automatically generated.

        Flavio Junqueira added a comment -

        Hi Vishal, In the run() method of WorkReceiver we perform a check already:

                                /*
                                 * If it is from an observer, respond right away.
                                 * Note that the following predicate assumes that
                                 * if a server is not a follower, then it must be
                                 * an observer. If we ever have any other type of
                                 * learner in the future, we'll have to change the
                                 * way we check for observers.
                                 */
                                if(!self.getVotingView().containsKey(response.sid)){
                                    Vote current = self.getCurrentVote();
                                    ToSend notmsg = new ToSend(ToSend.mType.notification,
                                            current.id,
                                            current.zxid,
                                            logicalclock,
                                            self.getPeerState(),
                                            response.sid);
        
                                    sendqueue.offer(notmsg);
        

        I'll be reviewing your patch shortly.

        Flavio Junqueira added a comment -

        Hi Vishal, Apart from not having a test, it is +1 for me. Looks good.

        Vishal Kher added a comment -

        Hi Flavio,

        Do you think we need a test for this? I was looking through the code to see how we can write one. What we can do is insert notifications in recvqueue for a peer, then call lookForLeader(), and monitor the state/proposedZxid/proposedLeader/etc. This would let us feed whatever notifications we want to FLE. The other peers should just ignore the notifications (or send the notifications we want them to send).

        However, for this we will have to change FastLeaderElection so that one can overload its Messenger and modify recvqueue, proposedLeader, proposedZxid, etc. from a test. I think this will be a good change in general, since it lets us feed notifications to a peer and test corner cases, but it is a bit time-consuming. I am not sure how much it will help for this particular bug, though. What do you think?

        -Vishal

        Flavio Junqueira added a comment -

        Hi Vishal, What if we have a test with 3 servers with ids 1, 2, and 3. We start 1 and 2, and let 2 be elected. Once 2 is elected, we start 3. With the current trunk code, 3 should declare itself LEADING, since it will receive the initial notifications of the other two processes. With your patch, this situation shouldn't happen.

        What do you think?

        Vishal Kher added a comment -

        Attaching patch with test.

        Vishal Kher added a comment -

        patch + test

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12476190/ZOOKEEPER-975.patch4
        against trunk revision 1091314.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        -1 javac. The patch appears to cause tar ant target to fail.

        -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/225//testReport/
        Console output: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/225//console

        This message is automatically generated.

        Vishal Kher added a comment -

        Don't know what happened there. Canceling the patch.

        Vishal Kher added a comment -

        Resubmitting the patch. Verified that the patch works on trunk and all unit tests pass.

        Vishal Kher added a comment -

        patch available on trunk

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12476255/ZOOKEEPER-975.patch5
        against trunk revision 1091841.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/226//testReport/
        Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/226//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://hudson.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/226//console

        This message is automatically generated.

        Flavio Junqueira added a comment -

        Hi Vishal, The patch is +1 for me, except for the findbugs warning it is generating. There are two findbugs reports, but I think only one is due to your patch:

        JLM 	Synchronization performed on java.util.concurrent.ArrayBlockingQueue in org.apache.zookeeper.server.quorum.QuorumCnxManager.addToRecvQueue(QuorumCnxManager$Message
        

        It sounds like you'd have to synchronize the block in addToRecvQueue using another object to get findbugs not to complain.
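        A minimal sketch of that suggestion: synchronize on a dedicated lock object rather than on the ArrayBlockingQueue itself, which is what findbugs' JLM detector flags. Field and method names here are illustrative, not the exact ones in QuorumCnxManager:

        ```java
        import java.util.concurrent.ArrayBlockingQueue;

        public class RecvQueueSketch {
            private static final int RECV_CAPACITY = 100;

            private final ArrayBlockingQueue<String> recvQueue =
                    new ArrayBlockingQueue<>(RECV_CAPACITY);
            // Dedicated monitor: synchronizing on recvQueue itself triggers
            // the findbugs JLM warning, since the queue's internal lock is a
            // different object from its monitor.
            private final Object recvQLock = new Object();

            public void addToRecvQueue(String msg) {
                synchronized (recvQLock) {
                    if (recvQueue.remainingCapacity() == 0) {
                        recvQueue.poll(); // drop the oldest message to make room
                    }
                    recvQueue.offer(msg);
                }
            }

            public String pollFromRecvQueue() {
                synchronized (recvQLock) {
                    return recvQueue.poll();
                }
            }
        }
        ```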

        Vishal Kher added a comment -

        fixed findbugs warning

        Vishal Kher added a comment -

        patch for trunk with fix for findbugs

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12477322/ZOOKEEPER-975.patch6
        against trunk revision 1091841.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 6 new or modified tests.

        -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/232//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/232//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/232//console

        This message is automatically generated.

        Vishal Kher added a comment -

        The findbugs failure here is due to ZOOKEEPER-1052, as pointed out by Flavio.

        Flavio Junqueira added a comment -

        Does anyone know why we are getting this javadoc warning? ZOOKEEPER-1052 also got -1 on javadoc, and I'm not sure what the problem is. Any hint would be welcome.

        Flavio Junqueira added a comment -

        The problem is not with the patch according to the console output:

        [javadoc] javadoc: warning - Error fetching URL: http://java.sun.com/javase/6/docs/api/package-list
        

        so it is +1 for me. If no one has anything against it, I'll commit it later.

        Thanks, Vishal.

        Flavio Junqueira added a comment -

        Committed revision 1097865.

        Thanks again, Vishal.

        Hudson added a comment -

        Integrated in ZooKeeper-trunk #1168 (See https://builds.apache.org/hudson/job/ZooKeeper-trunk/1168/)
        ZOOKEEPER-975. new peer goes in LEADING state even if ensemble is online. (vishal via fpj)


          People

          • Assignee:
            Vishal Kher
            Reporter:
            Vishal Kher
          • Votes:
            0
            Watchers:
            3
