[ZOOKEEPER-1367] Data inconsistencies and unexpired ephemeral nodes after cluster restart - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 3.4.2
Fix Version/s: 3.4.3, 3.3.5, 3.5.0
Component/s: server
Labels: None
Environment: Debian Squeeze, 64-bit
Hadoop Flags: Reviewed
Release Note: Fix Data inconsistencies and unexpired ephemeral nodes after cluster restart.

Description

In one of our tests, we have a cluster of three ZooKeeper servers. We kill all three, and then restart just two of them. Sometimes we notice that on one of the restarted servers, ephemeral nodes from previous sessions do not get deleted, while on the other server they do. We are effectively running 3.4.2, though technically we are running 3.4.1 with the patch for ZOOKEEPER-1333 manually applied, and a C client for 3.4.1 with the patches for ZOOKEEPER-1163.

I noticed that when I connected using zkCli.sh to the first node (90.0.0.221, zkid 84), I saw only one znode in a particular path:

[zk: 90.0.0.221:2888(CONNECTED) 0] ls /election/zkrsm
[nominee0000000011]
[zk: 90.0.0.221:2888(CONNECTED) 1] get /election/zkrsm/nominee0000000011
90.0.0.222:7777
cZxid = 0x400000027
ctime = Thu Jan 19 08:18:24 UTC 2012
mZxid = 0x400000027
mtime = Thu Jan 19 08:18:24 UTC 2012
pZxid = 0x400000027
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0xa234f4f3bc220001
dataLength = 16
numChildren = 0

However, when I connected zkCli.sh to the second server (90.0.0.222, zkid 251), I saw three znodes under that same path:

[zk: 90.0.0.222:2888(CONNECTED) 2] ls /election/zkrsm
nominee0000000006 nominee0000000010 nominee0000000011
[zk: 90.0.0.222:2888(CONNECTED) 2] get /election/zkrsm/nominee0000000011
90.0.0.222:7777
cZxid = 0x400000027
ctime = Thu Jan 19 08:18:24 UTC 2012
mZxid = 0x400000027
mtime = Thu Jan 19 08:18:24 UTC 2012
pZxid = 0x400000027
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0xa234f4f3bc220001
dataLength = 16
numChildren = 0
[zk: 90.0.0.222:2888(CONNECTED) 3] get /election/zkrsm/nominee0000000010
90.0.0.221:7777
cZxid = 0x30000014c
ctime = Thu Jan 19 07:53:42 UTC 2012
mZxid = 0x30000014c
mtime = Thu Jan 19 07:53:42 UTC 2012
pZxid = 0x30000014c
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0xa234f4f3bc220000
dataLength = 16
numChildren = 0
[zk: 90.0.0.222:2888(CONNECTED) 4] get /election/zkrsm/nominee0000000006
90.0.0.223:7777
cZxid = 0x200000cab
ctime = Thu Jan 19 08:00:30 UTC 2012
mZxid = 0x200000cab
mtime = Thu Jan 19 08:00:30 UTC 2012
pZxid = 0x200000cab
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x5434f5074e040002
dataLength = 16
numChildren = 0

These never went away for the lifetime of the server, for any clients connected directly to that server. Note that the cluster is still configured with all three servers; the third one (90.0.0.223, zkid 162) is down.
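
To make the discrepancy easier to read, here is a minimal sketch that diffs the two child listings and decodes the ids from the stat output above. It is not part of the original report. It assumes the usual zxid layout (epoch in the high 32 bits, counter in the low 32 bits) and, as a further assumption about the server's session-id scheme, that the top byte of a session id is the id of the server that issued it. All function and variable names are illustrative.

```python
# Values copied from the zkCli.sh output in this report.
CHILDREN_221 = {"nominee0000000011"}
CHILDREN_222 = {"nominee0000000006", "nominee0000000010", "nominee0000000011"}

# cZxid and ephemeralOwner as reported by server 90.0.0.222.
STATS_222 = {
    "nominee0000000011": {"cZxid": 0x400000027, "ephemeralOwner": 0xA234F4F3BC220001},
    "nominee0000000010": {"cZxid": 0x30000014C, "ephemeralOwner": 0xA234F4F3BC220000},
    "nominee0000000006": {"cZxid": 0x200000CAB, "ephemeralOwner": 0x5434F5074E040002},
}


def split_zxid(zxid):
    """Split a zxid into (epoch, counter), assuming a 32/32-bit layout."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def session_server_id(session_id):
    """Return the top byte of a session id.

    Assumption: the issuing server stores its own id in these 8 bits.
    """
    return (session_id >> 56) & 0xFF


def extra_children(reference, other):
    """Children present in `other` but missing from `reference`, sorted."""
    return sorted(other - reference)


def report():
    """Build one summary line per znode that only the second server shows."""
    lines = []
    for name in extra_children(CHILDREN_221, CHILDREN_222):
        epoch, counter = split_zxid(STATS_222[name]["cZxid"])
        owner = session_server_id(STATS_222[name]["ephemeralOwner"])
        lines.append(
            f"{name}: created in epoch {epoch} (counter {counter}), "
            f"session issued by server id {owner}"
        )
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
```

Under those assumptions, the two znodes visible only on 90.0.0.222 were created in epochs 2 and 3, while the one znode both servers agree on was created in epoch 4.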

I captured the data/snapshot directories for the two live servers. When I start single-node servers using each directory, I can briefly see that the inconsistent data is present in those directories, though the ephemeral nodes seem to get (correctly) cleaned up pretty soon after I start the server.

I will upload a tar containing the debug logs and data directories from the failure. I think we can reproduce it regularly if you need more info.

Attachments

1367-3.3.patch, 28/Jan/12 04:27, 8 kB, Ted Yu
ZOOKEEPER-1367.patch, 28/Jan/12 03:13, 16 kB, Benjamin Reed
ZOOKEEPER-1367.patch, 28/Jan/12 02:12, 15 kB, Benjamin Reed
ZOOKEEPER-1367.tgz, 20/Jan/12 18:51, 5.06 MB, Jeremy Stribling
ZOOKEEPER-1367-3.3.patch, 31/Jan/12 18:17, 12 kB, Benjamin Reed
ZOOKEEPER-1367-3.4.patch, 28/Jan/12 03:36, 16 kB, Benjamin Reed

People

Assignee: Benjamin Reed

Reporter: Jeremy Stribling

Votes: 0

Watchers: 8

Dates

Created: 20/Jan/12 18:48

Updated: 28/Aug/13 22:20

Resolved: 31/Jan/12 06:56
