[ZOOKEEPER-1506] Re-try DNS hostname -> IP resolution if node connection fails - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 3.4.5, 3.4.6
Fix Version/s: 3.4.7, 3.5.0, 3.6.0
Component/s: server
Labels:
- patch
Environment:

Ubuntu 11.04 64-bit

Release Note:

Tests pass with this patch. This patch is for the branch-3.4 branch ONLY.

Description

In our zoo.cfg we use hostnames to identify the ZK servers that are part of an ensemble. These hostnames are configured with a low (<= 60s) TTL, and the IP address they map to can and does change. Our procedure for replacing or upgrading a ZK node is to boot an entirely new instance and remap the hostname to the new instance's IP address. Our expectation is that when the original ZK node is terminated or shut down, the remaining nodes in the ensemble will reconnect to the new instance.

However, what we are noticing is that the remaining ZK nodes do not attempt to re-resolve the hostname->IP mapping for the new server. Once the original ZK node is terminated, the existing servers continue to attempt contacting it at the old IP address. It would be great if the ZK servers could try to re-resolve the hostname when attempting to connect to a lost ZK server, instead of caching the lookup indefinitely. Currently we must do a rolling restart of the ZK ensemble after swapping a node – which at three nodes means we periodically lose quorum.
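For illustration only, the requested behavior can be sketched in Java. A `java.net.InetSocketAddress` built from a hostname resolves it once at construction and keeps that result, so picking up a changed mapping means building a new instance from the hostname. The class and method names below (`ReResolveSketch`, `reResolve`) are hypothetical and are not taken from the ZooKeeper code base or from any attached patch.

```java
import java.net.InetSocketAddress;

// Hypothetical sketch, not the actual ZooKeeper patch.
final class ReResolveSketch {

    // Builds a fresh InetSocketAddress from the hostname of a stale one.
    // The constructor attempts a new hostname lookup; if the lookup fails,
    // the returned address is simply marked unresolved (no exception).
    static InetSocketAddress reResolve(InetSocketAddress stale) {
        // getHostString() returns the configured hostname (or the literal IP)
        // without triggering a reverse DNS lookup.
        return new InetSocketAddress(stale.getHostString(), stale.getPort());
    }

    public static void main(String[] args) {
        // "localhost" and port 2888 are placeholder values for this example.
        InetSocketAddress cached = new InetSocketAddress("localhost", 2888);
        // After a failed connection attempt, look the hostname up again
        // instead of reusing the cached address.
        InetSocketAddress fresh = reResolve(cached);
        System.out.println(fresh.getHostString() + ":" + fresh.getPort()
                + " unresolved=" + fresh.isUnresolved());
    }
}
```

A caller would invoke such a helper only after a connection attempt fails, so the normal path pays no extra lookup cost. Note that the JVM keeps its own DNS cache, governed by the `networkaddress.cache.ttl` security property, so a fresh lookup may still return the old address until that cache entry expires.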

The exact method we are following is to boot new instances in EC2 and attach one of a set of three Elastic IP addresses. External to EC2, this IP address remains the same and maps to whatever instance it is attached to. Internal to EC2, the Elastic IP hostname has a TTL of about 45-60 seconds and is remapped to the internal (10.x.y.z) address of the instance it is attached to. Therefore, in our case we would like ZK to pick up the new 10.x.y.z address that the Elastic IP hostname gets mapped to and reconnect appropriately.

Attachments

- zk-dns-caching-refresh.patch, 02/May/13 18:18, 7 kB, Michael Lasevich
- Zookeeper-1506.patch, 26/Aug/15 17:59, 40 kB, Robert P. Thille
- ZOOKEEPER-1506.patch, 21/Sep/15 23:01, 35 kB, Robert P. Thille
- ZOOKEEPER-1506.patch, 15/Sep/15 01:02, 35 kB, Robert P. Thille
- ZOOKEEPER-1506.patch, 10/Sep/15 22:14, 36 kB, Robert P. Thille
- ZOOKEEPER-1506.patch, 27/Aug/15 00:06, 39 kB, Robert P. Thille
- ZOOKEEPER-1506.patch, 25/Mar/15 17:18, 5 kB, Michi Mutsuzaki
- ZOOKEEPER-1506.patch, 25/Mar/15 07:33, 7 kB, Michi Mutsuzaki
- ZOOKEEPER-1506.patch, 15/Mar/15 07:39, 6 kB, Michi Mutsuzaki
- ZOOKEEPER-1506.patch, 20/Oct/14 03:12, 6 kB, Michi Mutsuzaki
- ZOOKEEPER-1506.patch, 07/Jun/14 19:38, 3 kB, Michi Mutsuzaki
- ZOOKEEPER-1506.patch, 01/May/14 05:07, 3 kB, Michi Mutsuzaki
- ZOOKEEPER-1506.patch, 30/Apr/14 22:58, 2 kB, Michi Mutsuzaki
- ZOOKEEPER-1506-fix.patch, 20/Apr/15 21:33, 0.7 kB, Michi Mutsuzaki

Issue Links

contains

- ZOOKEEPER-2319 UnresolvedAddressException cause the QuorumCnxManager.Listener exit (Resolved)

duplicates

- ZOOKEEPER-1846 Cached InetSocketAddresses prevent proper dynamic DNS resolution (Resolved)

is related to

- ZOOKEEPER-2982 Re-try DNS hostname -> IP resolution (Resolved)
- ZOOKEEPER-2184 Zookeeper Client should re-resolve hosts when connection attempts fail (Closed)

People

Assignee: Robert P. Thille

Reporter: Mike Heffner

Votes: 29

Watchers: 45

Dates

Created: 06/Jul/12 16:23

Updated: 18/Jun/18 04:05

Resolved: 23/Sep/15 17:19