Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
Description
We observed an intermittent issue when creating ZooKeeper client connections on OS X: sometimes the connection succeeds and sometimes it fails, and at random it is also very slow. It turns out both symptoms have the same cause.
My hosts file on OS X (an unmodified default) lists three entries for localhost:
127.0.0.1 localhost
::1 localhost
fe80::1%lo0 localhost
We saw ZooKeeper sometimes trying to connect to fe80:0:0:0:0:0:0:1%1, which is not listed. This happens about one time in four; it seems to round-robin over the addresses.
Whenever that happens, the connection sometimes works and sometimes fails, and in both cases it is very slow. The reason: the reverse lookup for fe80:0:0:0:0:0:0:1%1 can't be resolved from the hosts file, so it falls back to DNS. Sometimes that lookup succeeds, but other times it fails or times out after about 5 seconds. Platform-specific DNS settings probably hide this problem on Linux.
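To check whether a machine is affected, the reverse lookup can be timed for each address that localhost resolves to. The following is only a diagnostic sketch, not ZooKeeper code: the class and method names are hypothetical, and it assumes that InetAddress.getCanonicalHostName() triggers the same kind of reverse lookup that is slow here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper; not part of ZooKeeper.
public class ReverseLookupTimer {

    // Returns the elapsed time in milliseconds for one reverse lookup of addr.
    // getCanonicalHostName() asks the system resolver for the host name and
    // falls back to the textual IP address if the lookup fails.
    static long timeReverseLookupMillis(InetAddress addr) {
        long start = System.nanoTime();
        addr.getCanonicalHostName();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Print every address "localhost" resolves to, with its lookup time.
        for (InetAddress addr : InetAddress.getAllByName("localhost")) {
            System.out.println(addr.getHostAddress() + " -> "
                    + timeReverseLookupMillis(addr) + " ms");
        }
    }
}
```

On an affected machine, a link-local entry taking several seconds while the others return almost immediately would match the behavior described above.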
As a workaround, we now pre-resolve localhost with Inet4Address.getByName("localhost"). On my machine this always resolves to 127.0.0.1 and is fast.
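A slightly more defensive variant of this workaround is sketched below. Note that Inet4Address.getByName is the static method inherited from InetAddress, so by itself it does not force an IPv4 result; it returns 127.0.0.1 only when the JVM orders IPv4 addresses first. The sketch filters explicitly for an IPv4 address instead. The class name, method name, and port 2181 are illustrative assumptions, and the resulting string is meant to be passed as the connect string when creating the client.

```java
import java.io.UncheckedIOException;
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper; not part of ZooKeeper.
public class PreResolveLocalhost {

    // Resolves host and returns "<first IPv4 address>:<port>", so that no
    // IPv6 or link-local address ends up in the connect string.
    static String ipv4ConnectString(String host, int port) {
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                if (addr instanceof Inet4Address) {
                    return addr.getHostAddress() + ":" + port;
                }
            }
            throw new UnknownHostException("no IPv4 address for " + host);
        } catch (UnknownHostException e) {
            // Rethrow unchecked to keep call sites simple in this sketch.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 2181 is assumed here as the client port.
        System.out.println(ipv4ConnectString("localhost", 2181));
    }
}
```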
This fixes the issue for us. We're not sure where the fe80:0:0:0:0:0:0:1%1 address comes from, though. I don't recall having this issue with other server-side software, so it might be a mix of platform setup, OS X-specific defaults, and ZooKeeper behavior.
I've seen one IPv6-related ZooKeeper ticket that might be relevant: ZOOKEEPER-667. Perhaps the workaround for that ticket introduced this problem?
Issue Links
- is related to
  - ZOOKEEPER-1661 Random (?) 5s delay when establishing connection (Open)
  - ZOOKEEPER-1954 StaticHostProvider loses IPv6 scope ID when resolving server addresses (In Progress)