Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
Description
As of now, we store checkpoints in the commit metadata when writing via deltastreamer.
To support multiple writers (multiple deltastreamers), we added support some time back wherein the checkpoint is a map that stores multiple entries, with each key referring to a writer identifier:
{ {"writer1" = "checkpointVal1"}, {"writer2" = "checkpointVal2"} }
But this incurs some additional locking, since every time a new checkpoint needs to be updated, we have to reload the timeline and fetch the latest known commit metadata.
Instead, we can decouple the checkpoints.
Each writer updates only its own checkpoint. While parsing/fetching the latest known checkpoint for a writer, we might need to walk back in the timeline to find the right checkpoint value.
For example:
commit1 by writer1: commit metadata ➝ {writer1 = checkpointVal1}
commit2 by writer2: commit metadata ➝ {writer2 = checkpointValA}
commit3 by writer1: to fetch the latest checkpoint for writer1, we walk back the timeline and fetch the checkpoint of interest. Even though the latest commit metadata (commit2) has a checkpoint, its key refers to writer2, so we need to go further back and fetch the checkpoint from commit1.
Finally, writer1 updates the commit metadata to {writer1 = checkpointVal2}.
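The walk-back lookup described above could be sketched roughly as follows. This is only an illustration, not Hudi code: the timeline is modelled as a plain list of dicts ordered oldest to newest, and the names latest_checkpoint and "checkpoints" are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for the timeline: commits ordered oldest to newest.
# Each commit's metadata carries only the checkpoint of the writer that made it.
Commit = Dict[str, Dict[str, str]]


def latest_checkpoint(timeline: List[Commit], writer_id: str) -> Optional[str]:
    """Walk the timeline from newest to oldest and return the first
    checkpoint whose key matches writer_id, or None if there is none."""
    for commit in reversed(timeline):
        checkpoints = commit.get("checkpoints", {})
        if writer_id in checkpoints:
            return checkpoints[writer_id]
    return None


# The scenario from the description above.
timeline: List[Commit] = [
    {"checkpoints": {"writer1": "checkpointVal1"}},  # commit1 by writer1
    {"checkpoints": {"writer2": "checkpointValA"}},  # commit2 by writer2
]

# writer1 starts its next commit: the newest commit belongs to writer2,
# so the lookup skips it and resumes from commit1's checkpoint.
resume_from = latest_checkpoint(timeline, "writer1")

# writer1 then writes only its own entry into the new commit's metadata.
timeline.append({"checkpoints": {"writer1": "checkpointVal2"}})
```

In this sketch a writer never reads or rewrites another writer's entry, which is the point of the decoupling. The cost is that the lookup may have to scan several commits when other writers have committed more recently.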
Also, please check when the multiple-checkpoint support was added. If it was added before 0.13.0, we need to ensure this change is backwards compatible as well. If not, we are good, and just fixing the existing solution would suffice.