Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Version: 0.9
Description
When a checkpoint fails, no commit message is ever sent to the tasks.
A source that maintains a map of all pending checkpoints therefore leaks memory, because the entries for failed checkpoints are never removed.
Approaches to fix this:
- The source removes the entries of all older checkpoints once a checkpoint is committed (simple to implement with a linked hash map)
- The commit message could include the optional state handle (the source then need not maintain the map at all)
- The checkpoint coordinator could send messages for failed checkpoints?
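
A minimal sketch of the first approach, assuming checkpoints are registered in ascending ID order so that the insertion order of a LinkedHashMap matches checkpoint order. The class PendingCheckpoints and its methods add, commit and size are hypothetical names chosen for illustration; they are not taken from the actual source implementation.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not the actual source code. */
class PendingCheckpoints<S> {

    // Iteration follows insertion order, which equals checkpoint order
    // under the assumption that IDs are registered in ascending order.
    private final LinkedHashMap<Long, S> pending = new LinkedHashMap<>();

    /** Remembers the state snapshot taken for a checkpoint that is not yet committed. */
    void add(long checkpointId, S state) {
        pending.put(checkpointId, state);
    }

    /**
     * Returns the state of the committed checkpoint and drops it together with
     * every older entry. Older entries belong to checkpoints that failed or were
     * subsumed, so no commit message will ever arrive for them.
     * Returns null (and leaves the map untouched) if the ID is unknown.
     */
    S commit(long checkpointId) {
        if (!pending.containsKey(checkpointId)) {
            return null;
        }
        Iterator<Map.Entry<Long, S>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, S> entry = it.next();
            long id = entry.getKey();
            S state = entry.getValue();
            it.remove(); // drop the committed entry and everything before it
            if (id == checkpointId) {
                return state;
            }
        }
        return null; // not reached: the key was present
    }

    /** Number of checkpoints still awaiting a commit message. */
    int size() {
        return pending.size();
    }
}
```

With this cleanup the map holds at most the checkpoints started after the last successful commit, so its size stays bounded as long as some checkpoint eventually succeeds. If every checkpoint fails, entries still accumulate, which is where the other two approaches would help.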