Description
When a 3.3 server attempts to join an existing quorum lead by a 3.4 server, the 3.3 server is disconnected while trying to download the leader's snapshot. The 3.3 server restarts and starts the process over again, but is never able to join the quorum.
3.3 server log:
2012-12-07 10:44:34,582 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Learner@294] - Getting a snapshot from leader 2012-12-07 10:44:34,582 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Learner@325] - Setting leader epoch 12 2012-12-07 10:44:54,604 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Follower@82] - Exception when following the leader java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:84) at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108) at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:148) at org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:332) at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:75) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:645) 2012-12-07 10:44:54,605 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:Follower@165] - shutdown called java.lang.Exception: shutdown Follower at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:165) at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:649)
3.4 leader log:
2012-12-07 10:51:35,178 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@273] - Backward compatibility mode, server id=3
2012-12-07 10:51:35,178 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@542] - Notification: 3 (n.leader), 0x1100000000 (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x11 (n.peerEPoch), LEADING (my state)
2012-12-07 10:51:35,182 [myid:2] - INFO [LearnerHandler-/127.0.0.1:37654:LearnerHandler@263] - Follower sid: 3 : info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@262f4873
2012-12-07 10:51:35,182 [myid:2] - INFO [LearnerHandler-/127.0.0.1:37654:LearnerHandler@318] - Synchronizing with Follower sid: 3 maxCommittedLog=0x0 minCommittedLog=0x0 peerLastZxid=0x1100000000
2012-12-07 10:51:35,182 [myid:2] - INFO [LearnerHandler-/127.0.0.1:37654:LearnerHandler@395] - Sending SNAP
2012-12-07 10:51:35,183 [myid:2] - INFO [LearnerHandler-/127.0.0.1:37654:LearnerHandler@419] - Sending snapshot last zxid of peer is 0x1100000000 zxid of leader is 0x1200000000sent zxid of db as 0x1200000000
2012-12-07 10:51:55,204 [myid:2] - ERROR [LearnerHandler-/127.0.0.1:37654:LearnerHandler@562] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:450)
2012-12-07 10:51:55,205 [myid:2] - WARN [LearnerHandler-/127.0.0.1:37654:LearnerHandler@575] - ******* GOODBYE /127.0.0.1:37654 ********