ZooKeeper
ZOOKEEPER-335

zookeeper servers should commit the new leader txn to their logs.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.4.0
    • Component/s: server
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently the ZooKeeper followers do not commit the new leader transaction. This can cause problems in failure scenarios where a follower acks the same new-leader txn id twice, potentially to two different interim leaders, allowing them to propose two different txns with the same zxid.
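For background on why two proposals with the same zxid are dangerous: a zxid packs the leader epoch into its high 32 bits and a per-epoch counter into the low 32 bits, so comparing zxids implicitly compares epochs first. A minimal sketch of that layout (the class and method names here are illustrative, not taken from the ZooKeeper codebase):

```java
public class ZxidUtil {
    // High 32 bits of a zxid hold the leader epoch.
    public static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Low 32 bits hold the per-epoch transaction counter.
    public static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | counter;
    }

    public static void main(String[] args) {
        // 0x300000001, seen in the logs below: epoch 3, first txn of that epoch.
        long zxid = 0x300000001L;
        System.out.println(epochOf(zxid) + " " + counterOf(zxid));
    }
}
```

If two interim leaders can each get an ack for the same zxid, the epoch half no longer uniquely identifies a single leader's history, which is the inconsistency described above.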

      1. ZOOKEEPER-790.travis.log.bz2
        44 kB
        Travis Crawford
      2. ZOOKEEPER-335.patch
        55 kB
        Benjamin Reed
      3. ZOOKEEPER-335_5.patch
        58 kB
        Benjamin Reed
      4. ZOOKEEPER-335_4.patch
        58 kB
        Benjamin Reed
      5. ZOOKEEPER-335_3.patch
        58 kB
        Benjamin Reed
      6. ZOOKEEPER-335_2.patch
        56 kB
        Benjamin Reed
      7. zklogs.tar.gz
        75 kB
        Vishal Kher
      8. zk.log.gz
        5 kB
        Mike Solomon
      9. faultynode-vishal.txt
        114 kB
        Vishal Kher

        Issue Links

        There are no Sub-Tasks for this issue.

          Activity

          Mahadev konar added a comment -

          To fix this issue we require a server-server protocol change. This protocol change will break backwards compatibility. Maintaining backwards compatibility makes the code quite complex and tricky. Instead of making a last-minute change and having to do all the testing to check that backwards compatibility for servers is maintained, I am moving it to 3.3 so we can decide whether to fix it in a backwards-compatible manner or fix it in 4.0 and break backwards compatibility.

          Mahadev konar added a comment -

          Given that the change is non-trivial, as mentioned above, I am moving this to 3.4 for now.

          Patrick Hunt added a comment -

          Raising to blocker level - we've seen this reported by users a couple times now.

          Charity Majors added a comment -

          I ran into this bug this morning, but it also seemed to put my cluster into an unusable state. The cluster stopped accepting all connections, until I restarted node one. After node one departed the cluster, nodes two and three formed a quorum and started serving again. Node one was unable to rejoin, and had this error:

          2010-06-02 17:04:56,486 - FATAL [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@71] - Leader epoch a is less than our epoch b
          2010-06-02 17:04:56,486 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@82] - Exception when following the leader
          java.io.IOException: Error: Epoch of leader is lower

          until I cleared the data directory and restarted again.

          Vishal Kher added a comment -

          Hi,

          We are running into this bug very often (almost a 60-75% hit rate) while testing our newly developed application over ZK.
          This is almost a blocker for us. Would the fix be simpler if backward compatibility were not an issue?

          Thanks.

          Patrick Hunt added a comment -

          We are unable to reproduce this issue. If you can provide the server logs (all servers) and attach them to this jira it would be very helpful. Some detail on the approx time of the issue so we can correlate to the logs would help too (summary of what you did/do to cause it, etc... anything that might help us nail this one down).

          Detail on ZK version, OS, Java version, HW info, etc... would also be of use to us.

          Mike Solomon added a comment -

          I am having this exact issue, but I am not upgrading. I am merely restarting the cluster.

          I have a cluster of three. I took down host1 and verified that my application stayed up and reconnected to host2 and host3.

          With host1 back online, I took down host2. I noticed that the java process was spinning at over 100% CPU and realized it had not come back up.

          This is running the 3.3.0 JAR release on a dual proc, quad-core Intel box. I'm running SuSE 10.3, 64-bit, with this version of java:

          java version "1.6.0_10"
          Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
          Java HotSpot(TM) Server VM (build 11.0-b15, mixed mode)

          I will attach a log file.

          Mike Solomon added a comment -

          This is the log file of my second attempt to restart a wedged zookeeper server.

          Patrick Hunt added a comment -

          Thanks for the log Mike. This issue does seem similar to what Charity reported:

          2010-06-17 14:35:34,263 - FATAL [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@71] - Leader epoch 1 is less than our epoch 2

          Unfortunately the attached log shows information only after the problem occurred. Any chance you could upload the logs during the initial event? (what I mean is when the problem originally started) Also the logs from the other servers in the ensemble (again, at the time that the problem originally occurred) would really help. Thanks.

          Have you been able to clear the problem? It's fairly straightforward to resolve. Charity resolved it by: 1) bringing down the failing server, 2) clearing the data directory of that server (only), 3) starting that server. You only want to do this for the server that's unable to rejoin the quorum - i.e. the one that's outputting "Leader epoch 1 is less than our epoch 2" - not for all servers in the ensemble.

          Flavio Junqueira added a comment -

          Mike, There is one thing I don't understand. From the logs, it looks like servers 1 and 3 are proposing a zxid of 0 (second field of notification) during election, which makes me think that they had no state at all:

          2010-06-17 14:35:40,714 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@689] - Notification: 2, 8589934884, 2, 2, LOOKING, LOOKING, 2
          2010-06-17 14:35:40,714 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@799] - Notification: 3, 0, 1, 2, LOOKING, FOLLOWING, 1
          2010-06-17 14:35:40,714 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@799] - Notification: 3, 0, 1, 2, LOOKING, LEADING, 3
          

          Server 2 on the other hand had accepted updates based on the zxid it proposes. Were they supposed to have no state at all? Have you deleted your logs and snapshots before restarting the servers?
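Decoding the second field of those Notification lines supports this reading: 8589934884 carries epoch 2 in its high 32 bits, while a proposed zxid of 0 decodes to epoch 0, counter 0, i.e. no accepted state at all. A small sketch of the decoding (class and method names are illustrative):

```java
public class NotificationZxid {
    // Split the zxid field of a Notification line into (epoch, counter).
    public static long epoch(long zxid)   { return zxid >>> 32; }
    public static long counter(long zxid) { return zxid & 0xffffffffL; }

    public static void main(String[] args) {
        // Second field of the three Notification lines quoted above.
        long[] proposed = { 8589934884L, 0L, 0L };
        for (long zxid : proposed) {
            System.out.println("zxid=" + zxid
                    + " epoch=" + epoch(zxid)
                    + " counter=" + counter(zxid));
        }
    }
}
```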

          Patrick Hunt added a comment -

          Side note: Flavio, can you enter a JIRA to fix this log message? It would really be helpful to have attribute names in the log message to simplify debugging. (A general review of quorum logging would be great - fix stuff like this and ensure that enough information is provided to debug; add logging?) Perhaps coordinate with Ivan, I think he was trying to change things a bit to enable more log analysis support in the "loggrapher" contrib.

          Vishal Kher added a comment -

          Hi,

          I have attached the logs. The log entries are similar to those reported by others.

          We are testing with 3 nodes. Each node is run in a VM running SLES 11. All 3 VMs are run on the same host. VMs are sharing the same disk. cpuinfo and meminfo for VM is in the attached file.
          I have also tried to collect more info with -verbose:gc -Xloggc:/../ -Xprof options to java.

          gc.lg contains the gc output; the rest of the info should be in msgs.log.
          The default Java heap size was used, with java version "1.6.0_18".

          One point to note - in my case, the misbehaving node could not join the cluster. The rest of the cluster was stable (except for the flood of log messages on the leader, because the misbehaving follower kept terminating its session).

          Hope this helps.
          Thanks.

          Flavio Junqueira added a comment -

          Guys, I don't see enough information in these logs to determine what's going on. Let me tell you what I'm seeing so that perhaps other folks can help me out here.

          One part of the log that is suspicious is this one:

          =6693 [QuorumPeer:/0.0.0.0:2181] WARN  org.apache.zookeeper.server.quorum.Learner  - Got zxid 0x300000001 expected 0x1
          =6693 [QuorumPeer:/0.0.0.0:2181] WARN  org.apache.zookeeper.server.quorum.Learner  - Got zxid 0x300000001 expected 0x1
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor30]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor27]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor22]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor23]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor18]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor20]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor19]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor31]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor21]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor26]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor25]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor33]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor29]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor28]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor24]
          [Unloading class sun.reflect.GeneratedSerializationConstructorAccessor32]
          
          ************* NODE RESTARTED HERE **********************
          

          Before being restarted, the bad node receives a proposal with zxid <3,1> when it expects <0,1>. Next in the logs, after being restarted, I can see it complaining that it has epoch 4 while the leader has epoch 3. Something strange apparently happened during the restart. It also seems that the node was able to talk to the others (first entries in the log before the excerpt above).

          Do you guys see anything I'm overlooking?

          Patrick Hunt added a comment -

          Vishal comment on list:

          I might be wrong here, but let me try to chip in my two cents.

          I think the problem is in LearnerHandler.java, at the leader for this follower.

           /* see what other packets from the proposal
            * and tobeapplied queues need to be sent
            * and then decide if we can just send a DIFF
            * or we actually need to send the whole snapshot
            */
           long leaderLastZxid = leader.startForwarding(this, updates);
           // ---> this leaderLastZxid returned is probably incorrect.
           // a special case when both the ids are the same
           if (peerLastZxid == leaderLastZxid) {
               packetToSend = Leader.DIFF;
               zxidToSend = leaderLastZxid;
           }

           QuorumPacket newLeaderQP = new QuorumPacket(Leader.NEWLEADER,
                   leaderLastZxid, null, null);
           oa.writeRecord(newLeaderQP, "packet");
           bufferedOutput.flush();

          Patrick Hunt added a comment -

          Vishal comment on list:

          Nevermind. I am on the wrong track. Flavio's earlier mail did clarify that
          the follower received the epoch before restart.

          Patrick Hunt added a comment -

          Vishal comment on list:

          Hi Flavio,

          I have 3 sets of logs, and they all seem to indicate two problems on the
          misbehaving follower:

          Problem 1: Expected zxid is incorrect
          =0 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x300000002 expected 0x1
          =0 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x300000002 expected 0x1
          =2495 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x400000001 expected 0x1
          =2495 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x400000001 expected 0x1
          =191617 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x500000001 expected 0x1
          =191617 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x500000001 expected 0x1
          =0 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x600000001 expected 0x1
          =0 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x600000001 expected 0x1
          =245016 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x700000001 expected 0x1
          =245016 [QuorumPeer:/0.0.0.0:2181] WARN org.apache.zookeeper.server.quorum.Learner - Got zxid 0x700000001 expected 0x1

          Note expected zxid is always 0x1 (lastQueued is always 0?)

          Problem 2: While joining the cluster, the expected epoch is 1 higher than
          seen earlier:
          =14991 [QuorumPeer:/0.0.0.0:2181] FATAL org.apache.zookeeper.server.quorum.Learner - Leader epoch 7 is less than our epoch 8

          -Vishal

          Vishal Kher added a comment -

          Hi,

          I enabled tracing and did some more debugging. It looks like the restarted peer (while trying to join the cluster) determines that it is a leader and increments its epoch. However, the rest of the nodes don't acknowledge this node as the leader, and hence have an older epoch. I will attach the log. Unfortunately, I don't have traces from other nodes. I will repeat the experiment later and attach logs from other nodes.

          Scenario:

          • Form a 3-node cluster. This is not just a ZK cluster; it also involves our application cluster that uses ZK.
          • Kill one of the followers
          • After a minute or so, restart the follower
          • The follower rejects the leader with "Leader epoch y is less than our epoch y + 1"

          From logs:

          a) Peer X restarts and starts leader election. For a small window of time, X thinks that it is the new leader! During this window, for some reason, the rest of the nodes tell X that they are also trying to find a leader, i.e., all 3 nodes are in the LOOKING state. After seeing that all 3 nodes are in LOOKING state, X decides to be the leader?

          155 2010-06-20 23:22:46,421 - DEBUG [WorkerSender Thread:QuorumCnxManager@346] - Opening channel to server 1
          156 2010-06-20 23:22:46,423 - DEBUG [WorkerReceiver Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new notification message. My id = 0
          157 2010-06-20 23:22:46,424 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 0, 77309411393, 1, 0, LOOKING, LOOKING, 0
          158 2010-06-20 23:22:46,424 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@495] - id: 0, proposed id: 0, zxid: 77309411393, proposed zxid: 77309411393
          159 2010-06-20 23:22:46,424 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@717] - Adding vote: From = 0, Proposed leader = 0, Porposed zxid = 77309411393, Proposed epoch = 1
          160 2010-06-20 23:22:46,426 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (1, 0)
          161 2010-06-20 23:22:46,426 - DEBUG [WorkerSender Thread:QuorumCnxManager@346] - Opening channel to server 2
          162 2010-06-20 23:22:46,427 - DEBUG [Thread-1:QuorumCnxManager$Listener@445] - Connection request /192.168.1.182:46701
          163 2010-06-20 23:22:46,427 - DEBUG [Thread-1:QuorumCnxManager$Listener@448] - Connection request: 0
          164 2010-06-20 23:22:46,428 - DEBUG [Thread-1:QuorumCnxManager$SendWorker@504] - Address of remote peer: 1
          165 2010-06-20 23:22:46,428 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (2, 0)
          166 2010-06-20 23:22:46,431 - DEBUG [WorkerReceiver Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new notification message. My id = 0
          167 2010-06-20 23:22:46,432 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 1, 77309411372, 1, 0, LOOKING, LOOKING, 1
          168 2010-06-20 23:22:46,432 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@495] - id: 1, proposed id: 0, zxid: 77309411372, proposed zxid: 77309411393
          169 2010-06-20 23:22:46,432 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@717] - Adding vote: From = 1, Proposed leader = 1, Porposed zxid = 77309411372, Proposed epoch = 1
          170 2010-06-20 23:22:46,436 - DEBUG [Thread-1:QuorumCnxManager$Listener@445] - Connection request /192.168.1.183:44310
          171 2010-06-20 23:22:46,436 - DEBUG [Thread-1:QuorumCnxManager$Listener@448] - Connection request: 0
          172 2010-06-20 23:22:46,436 - DEBUG [Thread-1:QuorumCnxManager$SendWorker@504] - Address of remote peer: 2
          173 2010-06-20 23:22:46,440 - DEBUG [WorkerReceiver Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new notification message. My id = 0
          174 2010-06-20 23:22:46,440 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 2, 73014444097, 1, 0, LOOKING, LOOKING, 2
          175 2010-06-20 23:22:46,440 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@495] - id: 2, proposed id: 0, zxid: 73014444097, proposed zxid: 77309411393
          176 2010-06-20 23:22:46,441 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@717] - Adding vote: From = 2, Proposed leader = 2, Porposed zxid = 73014444097, Proposed epoch = 1
          177 2010-06-20 23:22:46,441 - INFO [QuorumPeer:/0.0.0.0:2181:QuorumPeer@647] - LEADING

          b) As a result, X increments its epoch. Worse, since this node decided to be the leader, it starts doing transactions. The first set of transactions starts removing all ephemeral nodes. But these transactions are only done locally; other peers do not ack these transactions since they know that this peer is not the leader.

          c) After a few seconds (8 secs), X relinquishes leadership since it does not receive any acks from the rest of the peers.
          d) It starts leader election again. This time, the rest of the peers respond correctly, informing X that a leader already exists.
          e) X determines that it should now follow the leader and starts a handshake with the leader. However, X has incremented its epoch in the meantime. X thinks that the epoch sent by the leader during the handshake is older than the epoch it has noticed before, and aborts the connection with the leader.
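The failure in step (e) comes down to a follower-side comparison of this shape when registering with the leader: the epoch embedded in the leader's zxid is checked against the follower's own last-seen epoch, and the connection is aborted if the leader's is lower. Because X bumped its own epoch while briefly LEADING, the check fails even though the real leader never changed. A simplified sketch, not the actual ZooKeeper source (only the quoted error message is taken from the logs above):

```java
import java.io.IOException;

public class EpochCheck {
    // Illustrative version of the check that produces
    // "Leader epoch X is less than our epoch Y" on the follower.
    public static void checkLeaderEpoch(long leaderZxid, long ourEpoch) throws IOException {
        long leaderEpoch = leaderZxid >>> 32; // epoch lives in the high 32 bits
        if (leaderEpoch < ourEpoch) {
            throw new IOException("Error: Epoch of leader is lower");
        }
    }

    public static void main(String[] args) {
        try {
            // X followed epoch 7, briefly elected itself and moved to epoch 8,
            // then reconnects to the real leader, which is still at epoch 7.
            checkLeaderEpoch(7L << 32, 8L);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```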

          Show
          Vishal Kher added a comment -

          Hi,

          I enabled tracing and did some more debugging. It looks like the restarted peer (trying to rejoin the cluster) determines that it is the leader and increments its epoch. However, the rest of the nodes do not acknowledge this node as the leader, and hence have an older epoch. I will attach the log. Unfortunately, I don't have traces from the other nodes. I will repeat the experiment later and attach their logs as well.

          Scenario: form a 3 node cluster. This is not just a ZK cluster; it also involves our application cluster that uses ZK.
          1. Kill one of the followers.
          2. After a minute or so, restart the follower.
          3. The follower rejects the leader with "Leader epoch y is less than our epoch y + 1".

          From the logs:

          a) Peer X restarts and starts leader election. For a small window of time, X thinks that it is the new leader! During this window, for some reason, the rest of the nodes tell X that they are also trying to find a leader, i.e., all 3 nodes are in LOOKING state. After seeing that all 3 nodes are in LOOKING state, X decides to be the leader:

          155 2010-06-20 23:22:46,421 - DEBUG [WorkerSender Thread:QuorumCnxManager@346] - Opening channel to server 1
          156 2010-06-20 23:22:46,423 - DEBUG [WorkerReceiver Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new notification message. My id = 0
          157 2010-06-20 23:22:46,424 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 0, 77309411393, 1, 0, LOOKING, LOOKING, 0
          158 2010-06-20 23:22:46,424 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@495] - id: 0, proposed id: 0, zxid: 77309411393, proposed zxid: 77309411393
          159 2010-06-20 23:22:46,424 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@717] - Adding vote: From = 0, Proposed leader = 0, Porposed zxid = 77309411393, Proposed epoch = 1
          160 2010-06-20 23:22:46,426 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (1, 0)
          161 2010-06-20 23:22:46,426 - DEBUG [WorkerSender Thread:QuorumCnxManager@346] - Opening channel to server 2
          162 2010-06-20 23:22:46,427 - DEBUG [Thread-1:QuorumCnxManager$Listener@445] - Connection request /192.168.1.182:46701
          163 2010-06-20 23:22:46,427 - DEBUG [Thread-1:QuorumCnxManager$Listener@448] - Connection request: 0
          164 2010-06-20 23:22:46,428 - DEBUG [Thread-1:QuorumCnxManager$SendWorker@504] - Address of remote peer: 1
          165 2010-06-20 23:22:46,428 - INFO [WorkerSender Thread:QuorumCnxManager@162] - Have smaller server identifier, so dropping the connection: (2, 0)
          166 2010-06-20 23:22:46,431 - DEBUG [WorkerReceiver Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new notification message. My id = 0
          167 2010-06-20 23:22:46,432 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 1, 77309411372, 1, 0, LOOKING, LOOKING, 1
          168 2010-06-20 23:22:46,432 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@495] - id: 1, proposed id: 0, zxid: 77309411372, proposed zxid: 77309411393
          169 2010-06-20 23:22:46,432 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@717] - Adding vote: From = 1, Proposed leader = 1, Porposed zxid = 77309411372, Proposed epoch = 1
          170 2010-06-20 23:22:46,436 - DEBUG [Thread-1:QuorumCnxManager$Listener@445] - Connection request /192.168.1.183:44310
          171 2010-06-20 23:22:46,436 - DEBUG [Thread-1:QuorumCnxManager$Listener@448] - Connection request: 0
          172 2010-06-20 23:22:46,436 - DEBUG [Thread-1:QuorumCnxManager$SendWorker@504] - Address of remote peer: 2
          173 2010-06-20 23:22:46,440 - DEBUG [WorkerReceiver Thread:FastLeaderElection$Messenger$WorkerReceiver@214] - Receive new notification message. My id = 0
          174 2010-06-20 23:22:46,440 - INFO [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@689] - Notification: 2, 73014444097, 1, 0, LOOKING, LOOKING, 2
          175 2010-06-20 23:22:46,440 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@495] - id: 2, proposed id: 0, zxid: 73014444097, proposed zxid: 77309411393
          176 2010-06-20 23:22:46,441 - DEBUG [QuorumPeer:/0.0.0.0:2181:FastLeaderElection@717] - Adding vote: From = 2, Proposed leader = 2, Porposed zxid = 73014444097, Proposed epoch = 1
          177 2010-06-20 23:22:46,441 - INFO [QuorumPeer:/0.0.0.0:2181:QuorumPeer@647] - LEADING

          b) As a result X increments its epoch. Worse, since this node decided to be the leader, it starts doing transactions. The first set of transactions starts removing all ephemeral nodes. But these transactions are only done locally; the other peers do not ack them since they know that this peer is not the leader.

          c) After a few seconds (8 secs), X relinquishes leadership since it does not receive any acks from the rest of the peers.

          d) It starts leader election again. This time, the rest of the peers respond correctly, informing X that a leader already exists.

          e) X determines that it should now follow the leader and starts the handshake with the leader. However, X has incremented its epoch in the meantime. X thinks that the epoch sent by the leader during the handshake is older than the epoch it has seen before, and aborts the connection with the leader.
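The self-election window in step (a) follows from how a peer compares incoming votes with its own. A minimal sketch of that comparison, assuming the (epoch, zxid, server id) ordering used by FastLeaderElection.totalOrderPredicate (names here are simplified; the real method differs in detail):

```java
// Simplified sketch of leader-election vote comparison: a new vote wins
// on higher epoch, then higher zxid, then higher server id.
// (Illustrative; cf. FastLeaderElection.totalOrderPredicate.)
public class VoteOrder {
    static boolean supersedes(long newEpoch, long newZxid, long newId,
                              long curEpoch, long curZxid, long curId) {
        if (newEpoch != curEpoch) return newEpoch > curEpoch;
        if (newZxid != curZxid) return newZxid > curZxid;
        return newId > curId;
    }
}
```

In the trace above all three peers report epoch 1, and X's own zxid (77309411393) beats both 77309411372 and 73014444097, so X keeps voting for itself and, seeing every peer in LOOKING state, briefly assumes leadership.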
          Vishal Kher added a comment -

          Apologies for multiple attachments.

          Flavio Junqueira added a comment -

          Thanks for detailed assessment, Vishal. In Step b, the fact that the process believes it is the leader is not a problem, and it happens because we queue notification messages during leader election.

          The real issue is that leader code is setting the last processed zxid to the first of the new epoch even before connecting to a quorum of followers. Because the leader code sets this value before connecting to a quorum of followers (Leader.java:281) and the follower code throws an IOException (Follower.java:73) if the leader epoch is smaller, we have that when the false leader drops leadership and becomes a follower, it finds a smaller epoch and kills itself.

          I noticed that this follower check was not there before (not present in 3.0 branch), and it might have been introduced when we did the observer reorganization. For now I propose that we move line Leader.java:281 to Leader.java:470. It simply changes the point in which we set the last processed zxid to one in which we know that a quorum of followers supports the leader. I reasoned a bit about it and verified that tests pass.

          A patch for the change I'm proposing is trivial, but a unit test will require some work, so I'd rather hear opinions first. Also, please note that this problem is not related to the topic of this jira, so we might consider working on a different jira from this point on.
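The follower check referenced above (Follower.java:73) boils down to comparing the epoch halves of two zxids, since a zxid carries the epoch in its high 32 bits. A rough sketch (illustrative names, not the exact Follower.java code):

```java
import java.io.IOException;

// Sketch of the follower-side check that produces the fatal
// "Leader epoch X is less than our epoch Y" (illustrative; the actual
// logic lives in Follower.java:73).
public class FollowerEpochCheck {
    // A zxid packs the epoch into its high 32 bits.
    static long epochOf(long zxid) { return zxid >>> 32; }

    static void verifyLeaderEpoch(long leaderZxid, long ourLastZxid)
            throws IOException {
        if (epochOf(leaderZxid) < epochOf(ourLastZxid)) {
            // A false leader that already bumped its own epoch trips this
            // when it later tries to follow the real leader.
            throw new IOException("Error: Epoch of leader is lower");
        }
    }
}
```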

          Patrick Hunt added a comment -

          Vishal, if Flavio provides you with a patch could you apply it and verify with your configuration?

          Flavio, please provide an initial patch that people could use to verify. We'll hold off on a release until you add the test(s), but this would be great to start with.

          Thanks all for helping to track this down!

          I'd like to fast track a 3.3.2 release, so if possible please make this a priority.

          Flavio Junqueira added a comment -

          I have created a new jira for this issue: ZOOKEEPER-790. There is a patch there.

          Vishal Kher added a comment -

          I will try out the patch. FYI I am using 3.3.0.

          Travis Crawford added a comment -

          Unfortunately I still observed the "Leader epoch" issue and needed to manually force a leader election for the cluster to recover. This test was performed with the following base+patches, applied in the order listed.

          Zookeeper 3.3.1
          ZOOKEEPER-744
          ZOOKEEPER-790

          2010-07-15 02:43:57,181 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FileSnap@82] - Reading snapshot /data/zookeeper/version-2/snapshot.2300001ac2
          2010-07-15 02:43:57,384 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@649] - New election. My id =  1, Proposed zxid = 154618826848
          2010-07-15 02:43:57,385 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@689] - Notification: 1, 154618826848, 4, 1, LOOKING, LOOKING, 1
          2010-07-15 02:43:57,385 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@799] - Notification: 2, 146030952153, 3, 1, LOOKING, LEADING, 2
          2010-07-15 02:43:57,385 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@799] - Notification: 2, 146030952153, 3, 1, LOOKING, FOLLOWING, 3
          2010-07-15 02:43:57,385 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumPeer@642] - FOLLOWING
          2010-07-15 02:43:57,385 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@151] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /data/zookeeper/txlog/version-2 snapdir /data/zookeeper/version-2
          2010-07-15 02:43:57,387 - FATAL [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@71] - Leader epoch 23 is less than our epoch 24
          2010-07-15 02:43:57,387 - WARN  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@82] - Exception when following the leader 
          java.io.IOException: Error: Epoch of leader is lower
                  at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:73)
                  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:644)
          2010-07-15 02:43:57,387 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called 
          java.lang.Exception: shutdown Follower
                  at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
                  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:648)
          

          I followed the recipe @vishal provided for recreating.

          (a) Stop one follower in a three node cluster
          (b) Get some tea while it falls behind
          (c) Start the node stopped in (a).

          These timestamps show when the follower was stopped and when it was turned back on.

          2010-07-15 02:35:36,398 - INFO  [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@1661] - Established session 0x229aa13cfc6276b with negotiated timeout 10000 for client /10.209.45.114:34562
          2010-07-15 02:39:18,907 - INFO  [main:QuorumPeerConfig@90] - Reading configuration from: /etc/zookeeper/conf/zoo.cfg
          

          This timestamp is the first ``Leader epoch`` line. Everything between these two points will be the interesting bits.

          2010-07-15 02:39:43,339 - FATAL [QuorumPeer:/0:0:0:0:0:0:0:0:2181:Follower@71] - Leader epoch 23 is less than our epoch 24
          
          Travis Crawford added a comment -

          Please see my most recent comment for a summary of where the interesting lines are in this log file.

          Radu Marin added a comment -

          We are hitting this problem often. A follower starts looping infinitely on this, and it does not accept clients anymore.

          It says LOOKING, then FOLLOWING, then "Leader epoch y is less than our epoch y + 1", throws an exception in shutdown, and loops again.

          The only way I could make it work again was to clean up the data directory and restart the server.

          Any chance this will be fixed soon?

          Flavio Junqueira added a comment -

          Radu, It sounds like the problem you mention has been resolved in ZOOKEEPER-790. I'm not sure which version you're using, but perhaps you should consider moving to 3.3.2.

          Vishal Kher added a comment -

          to fix this issue we require a server-server protocol change. This protocol change will break backwards compatibility. To maintain backwards compatibility the code becomes quite complex and tricky. Instead of making a last minute change and having to do all the testing to check that backwards compatibility for servers is maintained, I am moving it to 3.3 to see if we want to fix it in a backwards compatible manner or fix it in 4.0 and break backwards compatibility.

          Until the bug is fixed, will it be possible to avoid running into this scenario by doing the following:
          1. Create a znode /dummy
          2. After receiving a SyncConnected write some data to /dummy
          3. Then proceed with other write transactions

          Essentially, do a NULL transaction from the client. Will this prevent the log divergence from affecting the ordering of client transactions?
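Steps 1-3 might look roughly like this on the client side (a sketch only: ZkClient is a hypothetical stand-in for the real ZooKeeper client handle, and error handling is elided):

```java
// Sketch of the client-side workaround: write to a dummy znode as the
// very first action after the session (re)connects, before any real
// transactions. ZkClient is a hypothetical stand-in for the ZooKeeper
// client handle.
interface ZkClient {
    void setData(String path, byte[] data, int version) throws Exception;
}

public class NullTxnOnConnect {
    // Call on receiving SyncConnected, before issuing other writes.
    static void commitNullTxn(ZkClient zk) throws Exception {
        // version -1 means "any version"; this forces the leader to commit
        // a transaction in its current epoch before real traffic arrives.
        zk.setData("/dummy", new byte[0], -1);
    }
}
```

As later comments in this thread note, this is timing-sensitive: it only helps if the null transaction really is the first one committed in the new epoch.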

          Vishal Kher added a comment -

          Folks,

          Can someone comment on whether the workaround I mentioned in my previous comment is valid?

          Thanks.
          -Vishal

          Vishal Kher added a comment -

          Hi,

          I think we are running into this bug where the leader dies after
          recording the transaction locally but before sending it out to the followers.

          This is causing log divergence, which is causing the client connected
          to the old leader (after reboot) to see different
          data. Unfortunately, I don't have the transaction logs to look
          into. But based on the ZooKeeper logs I am strongly suspecting that we
          are running into this bug. I have given the description below. Please
          let me know if you think I am on the wrong track.

          We have a 3 node cluster. IP addresses are *.53, *.54 and *.55.

          1) 54 is the leader and is rebooted by our test.

          2) Node 55 becomes ZK leader
          2011-05-30 22:56:45.797 INFO - [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@662] - LEADING

          The next transaction it performs will be zxid 0x5D00000001. We can see this from the zxid below:

          2011-05-30 22:56:46.819 INFO - [LearnerHandler[myid=2]-/192.168.4.1:33329:Leader@503] - Have quorum of supporters; starting up and setting last processed zxid: 399431958528

          3) Our client on 55 attempts to create a znode. This is the first
          transaction after 55 becomes the leader. The following application
          message tells us that it is writing to ZK.

          2011-05-30 22:56:52.608 INFO - commitTransaction - Committing transaction 21068:

          This will translate to ZooKeeper transaction ID 0x5D00000001. Now, at
          the same instant 53 was rebooted by the test.

          Now 54 and 53 are both down, 55 loses quorum and is rebooted by our watchdog.
          2011-05-30 22:56:54.922 WARN - [LearnerHandler[myid=2]-/192.168.4.1:33329:LearnerHandler@475] - ******* GOODBYE /192.168.4.1:33329 ********
          2011-05-30 22:56:55.916 INFO - [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Leader@396] - Shutdown called
          java.lang.Exception: shutdown Leader! reason: Only 0 followers, need 1

          At this step, I will assume that 55 logged transaction ID 0x5D00000001
          locally and that the other nodes went offline before receiving this
          transaction from 55. I am not able to verify this since I don't have
          the transaction logs. But I think this assumption is valid and can be
          inferred from the description below.

          4) Now, 53 and 54 reboot. 53 becomes leader. Since neither 53 nor 54
          has heard of 55's 0x5D00000001 transaction, 53 also decides to
          start issuing transactions from 0x5D00000001.
          Corresponding log entry from 53:
          2011-05-30 22:58:10.116 INFO - [LearnerHandler[myid=1]-/192.168.4.1:52419:Leader@503] - Have quorum of supporters; starting up and setting last processed zxid: 399431958528

          This is where 55 and rest of the cluster will differ in their
          states. Same zxid but different content.

          5) 55 reboots and rejoins the cluster. It tells the leader (53) that it
          has transaction 0x5D00000001 logged. The leader sends it a diff of
          transactions starting from 0x5D00000002.

          We can see that from the following entry in 55's log:
          2011-05-30 22:58:24.677 WARN - [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Learner@341] - Got zxid 0x5d00000002 expected 0x1

          Thus, the old transaction 0x5D00000001 remains in 55's log. As a result,
          the client connected to 55 sees different data.

          • I understand the fix proposed is to perform a null transaction after
            leader election. I can work on it, but I am not sure why this is considered a
            complex fix. Mahadev mentioned earlier that the bug fix (for the
            current ZAB version) will break backward compatibility. Can someone
            explain why?
          • In the meantime, as a workaround, we are forcing our client to do a dummy transaction
            after leader election. We have a few clients so I think this workaround
            will work without much overhead.

          Comments/suggestions?

          Thanks.
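The zxids in this trace line up with the divergence story once unpacked: a zxid carries the epoch in its high 32 bits and a per-epoch counter in its low 32 bits. A quick check using the constants from the logs above:

```java
// Unpack the zxids from the scenario above: epoch in the high 32 bits,
// per-epoch counter in the low 32 bits.
public class ZxidMath {
    static long epochOf(long zxid)   { return zxid >>> 32; }
    static long counterOf(long zxid) { return zxid & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        // "setting last processed zxid: 399431958528" in both leaders' logs:
        long lastProcessed = 399431958528L;
        System.out.println(Long.toHexString(lastProcessed)); // 5d00000000
        // Both 55 and, later, 53 therefore mint 0x5D00000001 as their first
        // proposal, which is how two different txns get the same zxid.
        System.out.println(Long.toHexString(lastProcessed + 1)); // 5d00000001
    }
}
```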

          Benjamin Reed added a comment -

          yes this is the scenario that creates the issue. note that step 5 has to occur soon after step 4 otherwise 55 will get a state transfer rather than a diff and the problem will be fixed.

          the fix isn't really performing a null transaction. (it is a work around, but you also need to get the timing right.) the problem is actually two parts: 1) the new epoch isn't getting logged properly. 2) once we do log properly we also need to make sure that we distinguish epochs that we tried to join from epochs that we did join. see https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab1.0 for the details.

          fixing it is rather easy, it is making sure that it is backwards compatible that is a bit tricky. i should have it done today hopefully. well at least a first cut. the problem is that method is rather complicated and adding in the backwards compatibility makes it more so. i'm going to avoid refactoring for the first cut so that it is clear what i have modified. note that this affects both the leading code and the leader election code.
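The "epochs we tried to join" vs. "epochs we did join" distinction in part 2 maps to keeping two separately persisted epoch values, which the Zab1.0 page linked above calls acceptedEpoch and currentEpoch. A minimal sketch of the bookkeeping (persistence and messaging elided; names illustrative):

```java
// Sketch of the Zab 1.0 two-epoch bookkeeping: acceptedEpoch records the
// highest epoch this peer has *tried* to join, currentEpoch the epoch it
// actually *joined*. (Illustrative; writing these to disk is elided.)
public class EpochBookkeeping {
    long acceptedEpoch; // logged before acking a new-epoch proposal
    long currentEpoch;  // updated when the new-leader txn is acked

    boolean ackNewEpoch(long e) {
        if (e <= acceptedEpoch) return false; // stale proposal: reject
        acceptedEpoch = e;                    // must hit disk before the ack
        return true;
    }

    void ackNewLeader(long e) {
        currentEpoch = e; // only now has this peer joined epoch e
    }
}
```

Keeping the two values separate lets recovery distinguish a peer that merely attempted an epoch from one that actually committed to it, which is the distinction the pre-fix code conflated.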

          Camille Fournier added a comment -

          Was there an email or something about this? Could we get a summary of the current state of correctness vis-à-vis the existing implementation? It makes me a bit nervous seeing this come in seemingly out of the blue, considering the criticality of the production systems I'm running on top of pre-Zab1.0.

          Vishal Kher added a comment -

          note that step 5 has to occur soon after step 4 otherwise 55 will get a state transfer rather than a diff and the problem will be fixed.

          Yes, I realized that. In our case, the new leader is sending diffs to the faulty node.

          the fix isn't really performing a null transaction. (it is a work around, but you also need to get the timing right.)

          It will be a workaround but it will be useful for pre-Zab 1.0 users. Why do you think we need to get the timing right? We just need to ensure that the leader performs the transaction before accepting any transactions from clients.

          Do you agree that my workaround from the client side will work? (another workaround is to always send SNAP instead of DIFF).

          Vishal Kher added a comment -

          Was there an email or something about this?

          http://zookeeper-user.578899.n2.nabble.com/Divergence-in-ZK-transaction-logs-in-some-corner-cases-td2547596.html
          Camille Fournier added a comment -

          What year is it now...?
          Would be nice if we got somewhat more regular pings about this stuff, for the newer joiners.

          Benjamin Reed added a comment -

          vishal, i'm not sure if i understand your proposal correctly. when i originally read it i thought you meant to have a client make sure to do an innocuous first transaction, but you would run into timing issues there because another client might be first.

          your comment got me thinking about another way, which you were probably proposing all along: if the leader issues the first innocuous transaction, and makes sure it is committed before it processes any more it might be easier than doing things at the quorum packet layer. i'm going to see if i can map it to the flow in the tech report. it would help with the backwards compatibility.

          Vishal Kher added a comment -

          vishal, i'm not sure if i understand your proposal correctly. when i originally read it i thought you meant to have a client make sure to do an innocuous first transaction, but you would run into timing issues there because another client might be first.

          You are right about the timing issue. To avoid another client from
          doing a transaction, I do the null transaction from all clients.
          Not efficient, but it avoids this race. However, while testing this
          client side workaround I ran into another problem. Our application
          uses watchdogs and reboots the machines when zookeeper is disconnected
          for long periods of time. As a result, the client cannot guarantee
          that the first transaction to be logged will always be the null
          transaction. It could be a createSession. So two nodes can still have
          log divergence. If a node misses a createSession in its transaction
          log, then it will not expire the session. Doh! I should have realized
          this. Now, I could go a step further and make sure that the first
          session creation is innocuous. This would work, but does not sound
          good. I will probably give up on my idea for a client side workaround.

          your comment got me thinking about another way, which you were probably proposing all along: if the leader issues the first innocuous transaction and makes sure it is committed before it processes any more requests, it might be easier than doing things at the quorum packet layer. i'm going to see if i can map it to the flow in the tech report. it would help with backwards compatibility.

          Yes, that was what I was proposing (and hence confused about backward compatibility).

          Benjamin Reed added a comment -

          are you able to reproduce the problem reliably? i think my test case does it, but i'm curious if it fixes your problem.

          Vishal Kher added a comment -

          I am able to repro the bug reliably.

          but i'm curious if it fixes your problem.

          What are you referring to? The client-side fix does solve the problem for us, but it is a bit of an ugly fix.

          Benjamin Reed added a comment -

          fixed the FLE tests.

          right now the hierarchical quorum tests are failing. flavio, can you see if you notice anything?

          Vishal Kher added a comment -

          Hi Ben, Flavio,

          Thanks. We will try out the patch.

          For the record, after some more discussion with Ben I realized that none of the workarounds I was trying would work. Fixing this bug needs a change in the protocol. Thanks for the clarification.

          Benjamin Reed added a comment -

          this has a small fix in it. still working on two more tests that aren't passing. i think it might be a problem in the tests.

          Benjamin Reed added a comment -

          i think this is it. after i make sure the tests pass, i'll upload it to review.apache.org.

          vishal and flavio can you check it out?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481761/ZOOKEEPER-335_4.patch
          against trunk revision 1132489.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/308//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/308//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/308//console

          This message is automatically generated.

          Patrick Hunt added a comment -

          Findbugs has a single issue that needs to be addressed:

          HE_EQUALS_USE_HASHCODE: Class defines equals() and uses Object.hashCode()
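          For readers unfamiliar with this warning: HE_EQUALS_USE_HASHCODE fires when a class overrides equals() but inherits Object.hashCode(), so two equal instances can land in different hash buckets of a HashSet or HashMap. A minimal standalone illustration of the fix follows; the Vote name merely echoes the patched file, and this sketch is not the actual ZooKeeper code.

```java
import java.util.Objects;

// Sketch of the equals()/hashCode() contract fix that clears
// HE_EQUALS_USE_HASHCODE: whenever equals() is overridden,
// hashCode() must be overridden consistently.
public final class Vote {
    private final long id;
    private final long zxid;

    public Vote(long id, long zxid) {
        this.id = id;
        this.zxid = zxid;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Vote)) return false;
        Vote v = (Vote) o;
        return id == v.id && zxid == v.zxid;
    }

    // Without this override, equal Votes could hash differently,
    // which is exactly what the Findbugs warning flags.
    @Override
    public int hashCode() {
        return Objects.hash(id, zxid);
    }
}
```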

          Benjamin Reed added a comment -

          addressed flavio's review and the findbug issue.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/877/
          -----------------------------------------------------------

          Review request for fpj and Vishal Kher.

          Summary
          -------

          zookeeper servers should commit the new leader txn to their logs.

          This addresses bug ZOOKEEPER-335.
          https://issues.apache.org/jira/browse/ZOOKEEPER-335

          Diffs


          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Follower.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Learner.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/StateSummary.java PRE-CREATION
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/Vote.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/util/ZxidUtils.java PRE-CREATION
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerTestBase.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ClientBase.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ReadOnlyModeTest.java 1134016
          http://svn.apache.org/repos/asf/zookeeper/trunk/src/zookeeper.jute 1134016

          Diff: https://reviews.apache.org/r/877/diff

          Testing
          -------

          Thanks,

          Benjamin

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481990/ZOOKEEPER-335_5.patch
          against trunk revision 1132489.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 13 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/311//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/311//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/311//console

          This message is automatically generated.

          Flavio Junqueira added a comment -

          +1, I have reviewed the patch twice and I don't have further issues. Also, I believe the patch I provided in ZOOKEEPER-1082 has been correctly incorporated.

          Benjamin Reed added a comment -

          ok if there are no objections, i'd like to get this in tonight. i plan on just applying to trunk since the change is rather deep. given that we have never seen this bug in production it's probably better to not risk the stable branch.

          Benjamin Reed added a comment -

          Committed revision 1135382.

          Hudson added a comment -

          Integrated in ZooKeeper-trunk #1215 (See https://builds.apache.org/job/ZooKeeper-trunk/1215/)


            People

            • Assignee: Benjamin Reed
            • Reporter: Mahadev konar
            • Votes: 0
            • Watchers: 5

            Dates

            • Created:
            • Updated:
            • Resolved:

            Development