Details
- Type: Bug
- Status: Resolved
- Priority: Blocker
- Resolution: Fixed
- 1.15.0
Description
Cleanup of job state does not work properly in an HA setup. releaseAndTryRemove deletes the metadata held in the StateHandleStore before it cleans up the StateHandle itself. If the StateHandle cleanup then fails, the reference has already been removed from the StateHandleStore, so every cleanup retry fails because the StateHandle can no longer be deserialized.
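The ordering problem described above can be illustrated with a minimal, self-contained sketch. It does not use Flink's actual classes: Handle, Store, FlakyHandle, removeThenDiscard and discardThenRemove are hypothetical names invented for this illustration, and the in-memory map only stands in for the HA metadata store. In the sketch the lost reference shows up as data that can no longer be reached, whereas the real issue reports a deserialization failure on retry. The sketch shows one possible safe ordering (discard first, remove the reference only on success), which is not necessarily how the issue was resolved.

```java
import java.util.HashMap;
import java.util.Map;

final class CleanupOrderSketch {

    /** Hypothetical stand-in for a state handle whose backing data can be discarded. */
    interface Handle {
        void discard() throws Exception;
    }

    /** Hypothetical stand-in for the HA store that keeps references to handles. */
    static final class Store {
        private final Map<String, Handle> entries = new HashMap<>();

        void put(String name, Handle handle) {
            entries.put(name, handle);
        }

        boolean contains(String name) {
            return entries.containsKey(name);
        }

        /** Problematic order: the reference is removed before the data is discarded. */
        boolean removeThenDiscard(String name) {
            Handle handle = entries.remove(name);
            if (handle == null) {
                return true; // nothing referenced any more
            }
            try {
                handle.discard();
                return true;
            } catch (Exception e) {
                return false; // the reference is already gone, so a retry cannot reach the data
            }
        }

        /** Safer order: the reference is removed only after the discard succeeded. */
        boolean discardThenRemove(String name) {
            Handle handle = entries.get(name);
            if (handle == null) {
                return true; // nothing left to clean up
            }
            try {
                handle.discard();
            } catch (Exception e) {
                return false; // the reference is kept, so a retry can try again
            }
            entries.remove(name);
            return true;
        }
    }

    /** Handle whose first discard attempt fails and whose later attempts succeed. */
    static final class FlakyHandle implements Handle {
        int attempts = 0;
        boolean discarded = false;

        @Override
        public void discard() throws Exception {
            attempts++;
            if (attempts == 1) {
                throw new Exception("simulated transient cleanup failure");
            }
            discarded = true;
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        FlakyHandle handle = new FlakyHandle();
        store.put("job-1", handle);

        boolean first = store.discardThenRemove("job-1");
        boolean second = store.discardThenRemove("job-1");
        System.out.println("first attempt succeeded: " + first);
        System.out.println("second attempt succeeded: " + second);
        System.out.println("data discarded: " + handle.discarded);
    }
}
```

With the discard-first ordering, a failed cleanup leaves the reference in place so that a retry still has something to work with; with the remove-first ordering, the same failure leaves the data behind with no reference pointing to it.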
Issue Links
- causes
  - FLINK-26987 ZooKeeperStateHandleStore.getAllAndLock ends up in a infinite loop if there's an entry marked for deletion that's not cleaned up, yet (Resolved)
- is blocked by
  - FLINK-26285 ZooKeeperStateHandleStore does not handle not existing nodes properly in getAllAndLock (Resolved)
- is caused by
  - FLINK-25432 Introduce common interfaces for cleaning up local and global job data (Resolved)
- is cloned by
  - FLINK-26286 The KubernetesStateHandleStore cleans the metadata before cleaning the StateHandle (Resolved)
- Testing discovered
  - FLINK-26285 ZooKeeperStateHandleStore does not handle not existing nodes properly in getAllAndLock (Resolved)
  - FLINK-26288 Investigate why RetrievableStreamStateHandle.close does not call the close method of the wrapped instance (Open)
- links to