Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Duplicate
-
None
-
None
Description
We occasionally hit an issue where our Flink cluster will not startup if any of the zookeeper hosts passed in the "high-availability.zookeeper.quorum" config setting are unreachable. This seems to stem from us using an older zookeeper dependency version (3.4.10).
Sample error we see is shown below.
This error seems to stem from us being on an older zookeeper release (3.4.10). This has been fixed as part of: https://issues.apache.org/jira/browse/ZOOKEEPER-1576 in the 3.4.x branch (https://github.com/apache/zookeeper/commit/be1409cc9a14ac2e28693e0e02a0ba6d9713565e).
java.net.UnknownHostException: zk01-pa4.hpc.criteo.prod: Name or service not knownjava.net.UnknownHostException: zk01-pa4.hpc.criteo.prod: Name or service not known at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) at java.net.InetAddress.getAllByName0(InetAddress.java:1277) at java.net.InetAddress.getAllByName(InetAddress.java:1193) at java.net.InetAddress.getAllByName(InetAddress.java:1127) at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61) at org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445) at org.apache.flink.shaded.curator.org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29) at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:150) at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94) at org.apache.flink.shaded.curator.org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:55) at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.reset(ConnectionState.java:262) at org.apache.flink.shaded.curator.org.apache.curator.ConnectionState.start(ConnectionState.java:109) at org.apache.flink.shaded.curator.org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:191) at org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:259) at org.apache.flink.runtime.util.ZooKeeperUtils.startCuratorFramework(ZooKeeperUtils.java:131) at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:123) at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:292) at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:257)
Attachments
Issue Links
- is superceded by
-
FLINK-16955 Bump Zookeeper 3.4.X to 3.4.14
- Closed
- links to