Apache Hudi / HUDI-6709

Multi Table Transactions

Details

    • Type: Epic
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None
    • Epic Name: hudi-multi-txn

    Description

      Strawman idea: 

      • Introduce the notion of a "database" into Hudi's core (think of it as analogous to a database server), with its own timeline.
      • Introduce a parent-child relationship between each table's timeline and the database timeline, i.e., an action is complete only if it is completed in both timelines (similar to the data <=> metadata table sync today, although we can't reuse that).
      • A multi-table transaction first creates the action on the database timeline, then performs the actions on the individual tables, and finally completes the action on the database timeline.
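
      The three steps above can be sketched as a toy in-memory model. This is only an illustration of the strawman, not Hudi code: `Timeline`, `Database`, `multi_table_commit`, and `is_visible` are hypothetical names, instants are plain strings, and the sketch assumes every action goes through the database timeline. It ignores concurrency, rollback, and storage entirely.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    INFLIGHT = "inflight"
    COMPLETED = "completed"


@dataclass
class Timeline:
    """Toy timeline (hypothetical): maps an instant time to its state."""
    name: str
    instants: dict = field(default_factory=dict)

    def start(self, instant):
        self.instants[instant] = State.INFLIGHT

    def complete(self, instant):
        self.instants[instant] = State.COMPLETED

    def is_completed(self, instant):
        return self.instants.get(instant) is State.COMPLETED


@dataclass
class Database:
    """Hypothetical parent of several table timelines."""
    timeline: Timeline
    tables: dict  # table name -> Timeline

    def is_visible(self, table, instant):
        # An action counts as complete only if it is completed on BOTH
        # the table (child) timeline and the database (parent) timeline.
        return (self.tables[table].is_completed(instant)
                and self.timeline.is_completed(instant))


def multi_table_commit(db, instant, writes):
    """writes: table name -> zero-argument callable that performs the write."""
    db.timeline.start(instant)            # 1. create the action on the database timeline
    for table, write in writes.items():   # 2. perform the action on each table
        table_timeline = db.tables[table]
        table_timeline.start(instant)
        write()
        table_timeline.complete(instant)
    db.timeline.complete(instant)         # 3. complete it on the database timeline
```

      If any table write raises, step 3 never runs, so the action stays inflight on the database timeline and `is_visible` returns False even for tables whose own timeline already shows the instant as completed. How such a partially applied action would be rolled back is not covered here.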

      Open items: 

      • Need to formalize the design, with considerations around isolation levels, nested queries, self joins, and avoiding phantom reads.
      • Need to lay out how we can deliver this via API and SQL (Spark for now).
      • Need to work out how this interplays with multi-writer scenarios and async table services.

            People

              Assignee: Sagar Sumit (codope)
              Reporter: Vinoth Chandar (vinoth)
