Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
As of now, only a single deltastreamer can write to a given Hudi table. We have an ask from the community to allow 2 deltastreamers to write to a single table.
Things required to be fixed:
- We need to fix the checkpointing to hold multiple key-value pairs, where the key is a unique identifier for the deltastreamer client and the value is that client's checkpoint. We might need to introduce a new notion of identifier for each deltastreamer in this case.
- Within delta sync, after writeClient.upsert and before calling writeClient.commit, we need to update the checkpoint value. For this, we might need to take a lock, fetch the latest checkpoint from the timeline (since there could be multiple writers), update the checkpoint, and then release the lock.
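The two items above can be sketched roughly as follows. This is only an illustration of the idea, not Hudi code: the class, the method names, and the in-memory map standing in for the checkpoint stored in the latest commit on the timeline are all hypothetical, and the ReentrantLock is a stand-in for whatever table-level lock the real implementation would take across processes.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of per-writer checkpoints; none of these names are Hudi APIs. */
class MultiWriterCheckpointSketch {

  // Stand-in for the checkpoint stored with the latest commit on the timeline:
  // key = deltastreamer identifier, value = that deltastreamer's checkpoint.
  private Map<String, String> latestCommittedCheckpoints = new TreeMap<>();

  // Stand-in for a table-level lock. An in-process lock is used only for illustration;
  // writers in separate processes would need a lock they can all see.
  private final ReentrantLock tableLock = new ReentrantLock();

  /**
   * Intended to run after the upsert and before the commit: take the lock, read the
   * latest checkpoints, overwrite only this writer's entry, "commit", release the lock.
   */
  Map<String, String> commitWithCheckpoint(String writerId, String newCheckpoint) {
    tableLock.lock();
    try {
      // Re-read the latest state under the lock, since another writer may have committed.
      Map<String, String> merged = new TreeMap<>(latestCommittedCheckpoints);
      // Update only our own key so the other writer's checkpoint is carried forward.
      merged.put(writerId, newCheckpoint);
      // Stand-in for committing with the merged checkpoint map attached.
      latestCommittedCheckpoints = merged;
      return Collections.unmodifiableMap(new TreeMap<>(merged));
    } finally {
      tableLock.unlock();
    }
  }

  /** What a writer would read on restart to resume from its own checkpoint. */
  Optional<String> checkpointFor(String writerId) {
    tableLock.lock();
    try {
      return Optional.ofNullable(latestCommittedCheckpoints.get(writerId));
    } finally {
      tableLock.unlock();
    }
  }
}
```

The important property is that each commit carries forward the other writer's latest checkpoint instead of dropping it, which is why the read-modify-write has to happen under the lock.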
These are the changes I can think of. There may be some more minor fixes required during implementation.
Ask from a user: https://github.com/apache/hudi/issues/6718