  1. Apache Hudi
  2. HUDI-6761

Fix rollbacks with MDT for MOR data table with log files

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: metadata
    • Labels: None

    Description

      There are a few rollback scenarios in which some log files from the data table (DT) can fail to sync to the metadata table (MDT). Especially for the cleaner's sake, every valid file in the data table (that is, every file visible via fs.listStatus) must be synced to the MDT. We can't afford to miss any log files.
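
      The invariant above amounts to a set difference between what the file system lists and what the MDT tracks. The following is only a minimal sketch: `find_unsynced_files` and the plain sets standing in for the fs.listStatus result and the MDT file listing are hypothetical, not Hudi APIs.

```python
def find_unsynced_files(dt_listing, mdt_listing):
    """Return files visible in the data table listing but absent from the MDT.

    Both arguments are hypothetical stand-ins: plain collections of file
    names, not real Hudi or Hadoop objects.
    """
    return sorted(set(dt_listing) - set(mdt_listing))


# Hypothetical state: lf2 was written by a failed delta commit and never synced.
dt_files = {"bf1", "lf1", "lf2"}
mdt_files = {"bf1", "lf1"}
missing = find_unsynced_files(dt_files, mdt_files)
print(missing)  # ['lf2']
```

      Any non-empty result is a file the cleaner would never learn about through the MDT.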

       

      Two major gaps need to be fixed.

      1. Log files from the original commit being rolled back.

      For example, t5.dc (a delta commit) fails midway in the DT after adding lf2.

      We start a rollback commit, t6.rb. When t6 syncs to the MDT, we should also track lf2 and ensure it is synced to the MDT.

      2. Log files added by previous attempts of the rollback.

      In the above scenario, the rollback could have added a log file (a rollback command block) called lf3.

      But if the rollback failed and is re-attempted, it could add another file called lf4. So when this rollback syncs to the MDT, we need to ensure that lf3 is also synced and not missed.
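
      The two gaps can be summarized as one union that the rollback's MDT sync has to cover. This is a sketch under stated assumptions, not how Hudi implements or should implement the fix: `RollbackAttempt` and `log_files_to_sync` are hypothetical names, and how earlier attempts' files (such as lf3) are discovered is left open, as in the description above.

```python
from dataclasses import dataclass, field


@dataclass
class RollbackAttempt:
    """Hypothetical record of one attempt at a rollback instant (e.g. t6.rb)."""
    log_files_written: set = field(default_factory=set)


def log_files_to_sync(failed_commit_log_files, attempts):
    """Collect every log file the rollback's MDT sync must cover.

    Gap 1: log files added by the commit being rolled back (e.g. lf2).
    Gap 2: log files added by every rollback attempt, including failed
           earlier ones (e.g. lf3), not just the latest (e.g. lf4).
    """
    files = set(failed_commit_log_files)
    for attempt in attempts:
        files |= attempt.log_files_written
    return sorted(files)


first_try = RollbackAttempt({"lf3"})   # failed attempt of t6.rb
second_try = RollbackAttempt({"lf4"})  # re-attempt of t6.rb
print(log_files_to_sync({"lf2"}, [first_try, second_try]))  # ['lf2', 'lf3', 'lf4']
```

      Syncing only the latest attempt's output would yield lf4 alone, which is exactly the miss described in both gaps.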

       

          People

            shivnarayan sivabalan narayanan