[ZOOKEEPER-4837] Network issue causes ephemeral node unremoved after the session expiration - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
Affects Version/s: 3.9.2
Fix Version/s: None
Component/s: quorum, server
Labels:
- pull-request-available

Description

In our testing cluster with the latest ZooKeeper version (66202cb), we observed that sometimes an ephemeral node never gets deleted if there is a network issue during the PROPOSAL request, even after the session expires. This bug is essentially related to ~~ZOOKEEPER-2355~~, but the issue was not entirely fixed in the previous patch. We also tested on some related open PRs (e.g., https://github.com/apache/zookeeper/pull/2152 and https://github.com/apache/zookeeper/pull/1925 ), and confirmed the issue exists after the proposed fix.

Steps to reproduce this bug:

Start a cluster with 3 servers, follower A, leader B, follower C
Open a zk client in server A
Create an ephemeral node in the client
Inject network issue in all server that causes SocketTimeoutException during readPacket if the packet is a PROPOSAL
Close the client
Wait until the cluster is stable (the leader will change between B and C several times)
Remove the network issue from all server
Check every server for ephemeral node existence. The ephemeral node will exist in server A. However, server B and C will not have the ephemeral node anymore.

Essentially the bug is caused by loadDatabase loading a snapshot with a higher Zxid than the truncated log, causing fastForwardFromEdits to fail when trying to replay the transactions. For example, if one of the follower has a lastProcessedZxid of 0x200000001 and last snapshot snapshot.200000001, and the leader sends a TRUNC with a zxid of 100000002, truncateLog will truncate the follower's log to 100000002. However, loadDatabase will load snapshot.200000001. So when fastForwardFromEdits happens, it will set the data tree to 200000001 instead of 100000002.

We also attached a test case to reproduce this issue. Note that this test case is still pretty flaky at this moment.

We proposed to fix this case by loading the database from the last snapshot that happens before the last truncated-log entry during truncateLog. See our PR attached. Of course, this may not be the ideal solution and we welcome suggestions. Some other potential solutions include:

(1) Disable fastForwardDatabase in shutdown

(2) Run setLastProcessedZxid at the end of Learner's syncWithLeader function if the packet is Leader.DIFF

Your insights are very much appreciated. We will continue following up this issue until it is resolved.

Attachments

Issue Links

links to

GitHub Pull Request #2172

Activity

People

Assignee:: Unassigned

Reporter:: Dimas Shidqi Parikesit

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 05/Jun/24 16:45

Updated:: 28/Nov/24 16:25

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

10m