[FLINK-26284] The ZooKeeperStateHandleStore cleans the metadata before cleaning the StateHandle - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 1.15.0
Fix Version/s: 1.15.0
Component/s: Runtime / Coordination
Labels:
- pull-request-available

Description

Cleanup of job state does not work properly in an HA setup. releaseAndTryRemove deletes the metadata held in the store before it cleans up the StateHandle. If the StateHandle cleanup then fails, the reference has already been removed from the StateHandleStore, so every cleanup retry fails because it can no longer deserialize the StateHandle.
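
To make the ordering problem concrete, here is a minimal in-memory sketch. It does not use Flink's API: CleanupOrderSketch, Store, Handle, FlakyHandle, removeMetadataFirst and removeStateFirst are hypothetical names invented for illustration, and a map lookup that returns null stands in for the failed deserialization described above. It only assumes that discarding a handle can fail transiently and that cleanup is retried.

```java
import java.util.HashMap;
import java.util.Map;

public class CleanupOrderSketch {

    /** Hypothetical stand-in for a state handle whose discard may fail. */
    interface Handle {
        void discard() throws Exception;
    }

    /** Fails the first discard attempt and succeeds on later attempts. */
    static final class FlakyHandle implements Handle {
        private int attempts = 0;
        boolean discarded = false;

        @Override
        public void discard() throws Exception {
            if (attempts++ == 0) {
                throw new Exception("simulated transient failure");
            }
            discarded = true;
        }
    }

    /** Hypothetical in-memory store: key -> handle reference (the "metadata"). */
    static final class Store {
        final Map<String, Handle> metadata = new HashMap<>();

        /** Problematic order: the reference is gone before discard() is attempted. */
        boolean removeMetadataFirst(String key) {
            Handle handle = metadata.remove(key);
            if (handle == null) {
                return false; // nothing left to look up, so a retry cannot reach the handle
            }
            try {
                handle.discard();
                return true;
            } catch (Exception e) {
                return false; // state is orphaned: the reference was already removed
            }
        }

        /** Alternative order: keep the reference until discard() has succeeded. */
        boolean removeStateFirst(String key) {
            Handle handle = metadata.get(key);
            if (handle == null) {
                return true; // already cleaned up
            }
            try {
                handle.discard();
            } catch (Exception e) {
                return false; // reference is still stored, so a retry can find it
            }
            metadata.remove(key);
            return true;
        }
    }

    /** Runs two cleanup attempts and reports whether the handle ended up discarded. */
    static boolean discardedAfterRetry(boolean metadataFirst) {
        Store store = new Store();
        FlakyHandle handle = new FlakyHandle();
        store.metadata.put("job-1", handle);
        for (int attempt = 0; attempt < 2; attempt++) {
            if (metadataFirst) {
                store.removeMetadataFirst("job-1");
            } else {
                store.removeStateFirst("job-1");
            }
        }
        return handle.discarded;
    }

    public static void main(String[] args) {
        System.out.println("metadata first: discarded=" + discardedAfterRetry(true));
        System.out.println("state first:    discarded=" + discardedAfterRetry(false));
    }
}
```

In this sketch the metadata-first order leaves the handle undiscarded after the retry, while the state-first order lets the second attempt succeed. Whether the actual fix uses exactly this ordering is not stated here; the linked pull request is the authoritative source.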

Issue Links

causes

FLINK-26987 ZooKeeperStateHandleStore.getAllAndLock ends up in a infinite loop if there's an entry marked for deletion that's not cleaned up, yet

Resolved

is blocked by

FLINK-26285 ZooKeeperStateHandleStore does not handle not existing nodes properly in getAllAndLock

Resolved

is caused by

FLINK-25432 Introduce common interfaces for cleaning up local and global job data

Resolved

is cloned by

FLINK-26286 The KubernetesStateHandleStore cleans the metadata before cleaning the StateHandle

Resolved

Testing discovered

FLINK-26285 ZooKeeperStateHandleStore does not handle not existing nodes properly in getAllAndLock

Resolved

FLINK-26288 Investigate why RetrievableStreamStateHandle.close does not call the close method of the wrapped instance

Open

links to

GitHub Pull Request #18869

People

Assignee: Matthias Pohl

Reporter: Matthias Pohl

Votes: 0

Watchers: 1

Dates

Created: 21/Feb/22 14:27

Updated: 01/Apr/22 12:59

Resolved: 28/Feb/22 10:35