Details
- Type: Epic
- Status: Open
- Priority: Major
- Resolution: Unresolved
- hudi-multi-txn
Description
Strawman idea:
- Introduce the notion of a "database" into Hudi's core (think of it as analogous to a database server), with its own timeline.
- Introduce a parent-child relationship between the database timeline and each table's timeline, i.e., an action is complete only if it is completed in both timelines (similar to the data <=> metadata table sync today, although we can't reuse that).
- A multi-table transaction will first create the action on the database timeline, then perform the actions on the individual tables, and finally complete the action on the database timeline.
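The three steps above can be sketched roughly as follows. This is only an illustrative in-memory model, not Hudi code: `Timeline`, `State`, `is_visible`, and `run_multi_table_txn` are hypothetical names, instants are plain strings, and the sketch assumes every action shown is part of a multi-table transaction (so it always has an entry on the database timeline).

```python
from enum import Enum


class State(Enum):
    REQUESTED = "requested"
    COMPLETED = "completed"


class Timeline:
    """Hypothetical in-memory timeline: maps an instant time to its state."""

    def __init__(self, name):
        self.name = name
        self.instants = {}

    def create(self, instant):
        self.instants[instant] = State.REQUESTED

    def complete(self, instant):
        self.instants[instant] = State.COMPLETED

    def is_completed(self, instant):
        return self.instants.get(instant) is State.COMPLETED


def is_visible(instant, table_timeline, db_timeline):
    """An action counts as complete only if it is completed in both timelines."""
    return table_timeline.is_completed(instant) and db_timeline.is_completed(instant)


def run_multi_table_txn(instant, db_timeline, writes):
    """Run one transaction; `writes` is a list of (table_timeline, write_fn) pairs."""
    # 1. Create the action on the database timeline first.
    db_timeline.create(instant)
    # 2. Perform and complete the action on each individual table.
    for table_timeline, write_fn in writes:
        table_timeline.create(instant)
        write_fn()  # assumed to raise on failure
        table_timeline.complete(instant)
    # 3. Complete on the database timeline; this is the single commit point.
    db_timeline.complete(instant)
```

In this model, if any table write fails, the action never completes on the database timeline, so `is_visible` returns False for every table, including those whose own timeline already shows the action as completed. Rollback of such partially applied actions is left out of the sketch.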
Open items:
- Need to formalize the design, with considerations around isolation levels, nested queries, self joins, and avoiding phantom reads.
- Need to lay out how we can deliver this via API and SQL (Spark for now).
- Need to work out how this interplays with multi-writer scenarios and async table services.