Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Environment: Linux, ZK trunk
Description
While testing the patch for https://issues.apache.org/jira/browse/ZOOKEEPER-1691, the test included with that patch failed for me. The failure happens when the test shuts down some followers and then attempts to bring them back up:
2013-12-13 17:31:03,976 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:11227:Follower@194] - shutdown called
java.lang.Exception: shutdown Follower
at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:194)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:971)
...
2013-12-13 17:31:03,992 [myid:1] - INFO [QuorumPeerListener:QuorumCnxManager$Listener@544] - My election bind port: localhost/127.0.0.1:11229
2013-12-13 17:31:03,992 [myid:1] - ERROR [localhost/127.0.0.1:11229:QuorumCnxManager$Listener@557] - Exception while listening
java.net.BindException: Address already in use
at java.net.PlainSocketImpl.socketBind(Native Method)
at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
at java.net.ServerSocket.bind(ServerSocket.java:376)
at java.net.ServerSocket.bind(ServerSocket.java:330)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:546)
The problem appears to be that when follower.shutdown() is called in QuorumPeer.run(), the election algorithm is never shut down, so its listener keeps the election port bound and the node cannot bind to the same port when it restarts.
I will upload a patch that calls shutdown() for both the leader and the follower in this case, but I'm not positive it's the right place or the right fix for this issue, so feedback would be appreciated.
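To illustrate the suspected failure mode outside of ZooKeeper, here is a minimal standalone sketch using plain java.net.ServerSocket. The RebindSketch class, its Listener wrapper, and the canRebind helper are hypothetical names made up for this illustration; they are not ZooKeeper code and do not reflect how QuorumCnxManager is actually structured. The sketch only shows that a second bind to a port fails while the first server socket is still open, and succeeds once that socket has been closed.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class RebindSketch {

    /** Hypothetical stand-in for an election listener that owns a server socket. */
    static final class Listener implements AutoCloseable {
        private final ServerSocket socket;

        Listener(int port) throws IOException {
            socket = new ServerSocket();
            try {
                // Bind to loopback; port 0 asks the OS for any free port.
                socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            } catch (IOException e) {
                socket.close(); // do not leak the unbound socket on failure
                throw e;
            }
        }

        int port() {
            return socket.getLocalPort();
        }

        @Override
        public void close() throws IOException {
            socket.close(); // releases the port
        }
    }

    /** Returns true if a new listener can bind to the given port right now. */
    static boolean canRebind(int port) throws IOException {
        try (Listener second = new Listener(port)) {
            return true;
        } catch (BindException e) {
            // Same exception type as in the log above: "Address already in use".
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Listener first = new Listener(0);
        int port = first.port();

        // The old listener was never shut down, so the port is still held.
        System.out.println("rebind while old listener is open: " + canRebind(port));

        // After an explicit shutdown of the old listener, the port is free again.
        first.close();
        System.out.println("rebind after old listener is closed: " + canRebind(port));
    }
}
```

On Linux I would expect the first attempt to fail and the second to succeed, which matches the idea that shutting down the election algorithm (and with it the listener) before the restart should avoid the BindException.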