Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: 2.11.0
Fix Version/s: None
Component/s: None
Description
The ZooKeeper version is 3.5.1-alpha.
I have a 3-node ZK cluster with read-only mode enabled.
Two nodes are down, so the remaining one (QA-E8WIN11) is in read-only mode (verified manually using the ZK API). All the machines of the ensemble can be pinged from the client.
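For anyone trying to reproduce the setup, one way to double-check that a server is in read-only mode is ZooKeeper's `isro` four-letter command, which replies `ro` or `rw`. The following is only a minimal sketch and not necessarily the check used above: the class name `ReadOnlyProbe` and its helper methods are hypothetical, and depending on the server version and configuration the four-letter commands may need to be enabled.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Curator or ZooKeeper.
public class ReadOnlyProbe {

    // Sends the "isro" four-letter command and returns the trimmed raw reply.
    static String queryIsro(String host, int port, int timeoutMs) throws Exception {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write("isro".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // The server closes the connection after answering,
            // so reading until end of stream yields the whole reply.
            InputStream in = socket.getInputStream();
            return new String(in.readAllBytes(), StandardCharsets.US_ASCII).trim();
        }
    }

    // Maps the raw reply to a human-readable mode.
    static String describe(String reply) {
        if ("ro".equals(reply)) {
            return "read-only";
        }
        if ("rw".equals(reply)) {
            return "read-write";
        }
        return "unknown: " + reply;
    }

    public static void main(String[] args) throws Exception {
        // Host and port default to the surviving node from this report.
        String host = args.length > 0 ? args[0] : "QA-E8WIN11";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println(host + ":" + port + " is " + describe(queryIsro(host, port, 5000)));
    }
}
```

Running it against each host in the connect string shows which servers answer at all and in which mode. A host whose ZooKeeper process is down would be expected to fail with a connection error rather than return a reply.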
I'm using this piece of code:
Builder curatorClientBuilder = CuratorFrameworkFactory.builder()
    .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181")
    .sessionTimeoutMs(45000)
    .connectionTimeoutMs(15000)
    .retryPolicy(new RetryNTimes(3, 5000))
    .canBeReadOnly(true);
CuratorFramework client = curatorClientBuilder.build();
client.start();
client.getZookeeperClient().blockUntilConnectedOrTimedOut();
System.out.println("Successfully established the connection with ZooKeeper");
client.getData().forPath("/");
System.out.println("Done.");
When Curator picks the host that is up first, the call completes very quickly. When it picks the host that is down first (QA-E8WIN12), it appears to be stuck at the getData() call for a very long time and eventually fails with a ConnectionLossException (see attached log).