Apache Hudi / HUDI-1292 [Umbrella] RFC-15 : File Listing and Query Planning Optimizations / HUDI-2468

Fix rollback of first commit after being synced to metadata table



    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • Fix Version/s: 0.10.0
    • None


      Suppose the data table has only one commit, and that commit has also been applied to the metadata table.

      Now the user, for some reason, wants to roll back this commit in the data table.

      When the rollback reaches the metadata code path, it first goes through the bootstrap code path. There we read the last synced instant from the metadata table and compare it with the data timeline. Since the corresponding commit in the data timeline is inflight, the code deduces that the last synced instant is out of the active timeline and that the metadata table needs to be re-bootstrapped.

      However, bootstrapping is allowed only if there are no inflight instants in the data timeline. The very same commit is inflight in the data timeline, so we fail here.
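
      The conflict between the two checks can be modeled in a few lines. This is only a sketch of the logic described above, not the Hudi code: the class, the Instant type, and the methods needsRebootstrap and canBootstrap are hypothetical names made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the two checks; none of these names are Hudi APIs.
final class RollbackBootstrapSketch {
    enum State { COMPLETED, INFLIGHT }

    static final class Instant {
        final String timestamp;
        final State state;

        Instant(String timestamp, State state) {
            this.timestamp = timestamp;
            this.state = state;
        }
    }

    // True when the last synced instant is not a completed instant
    // on the data timeline, i.e. a re-bootstrap is deemed necessary.
    static boolean needsRebootstrap(List<Instant> dataTimeline, String lastSynced) {
        return dataTimeline.stream().noneMatch(
            i -> i.state == State.COMPLETED && i.timestamp.equals(lastSynced));
    }

    // True only when the data timeline has no inflight instants.
    static boolean canBootstrap(List<Instant> dataTimeline) {
        return dataTimeline.stream().noneMatch(i -> i.state == State.INFLIGHT);
    }

    public static void main(String[] args) {
        // The single commit "001" is inflight while it is being rolled back.
        List<Instant> dataTimeline = Arrays.asList(new Instant("001", State.INFLIGHT));
        System.out.println("needsRebootstrap = " + needsRebootstrap(dataTimeline, "001"));
        System.out.println("canBootstrap = " + canBootstrap(dataTimeline));
    }
}
```

      With the single commit inflight, the first check asks for a re-bootstrap while the second one forbids it, which is the dead end described above.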


      This could also be an issue when rolling back a bootstrap commit in the data table.

      Suppose we bootstrap a data table, which results in just one commit. If we later try to roll it back, we hit the same issue as above. All tests in TestBootstrap fail because of this when metadata is enabled.

      Possible fix:

      We can pass the instant currently being operated on when instantiating the metadata table writer, and ignore that instant among the inflight instants when checking the bootstrap prerequisite. But I am wondering whether there is a better approach.
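
      A minimal sketch of that idea, assuming the prerequisite check receives the instant being operated on as an extra argument. All names here (BootstrapPrecheckSketch, Instant, canBootstrap, instantBeingOperatedOn) are hypothetical and do not correspond to the actual metadata table writer API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed prerequisite check; not Hudi code.
final class BootstrapPrecheckSketch {
    enum State { COMPLETED, INFLIGHT }

    static final class Instant {
        final String timestamp;
        final State state;

        Instant(String timestamp, State state) {
            this.timestamp = timestamp;
            this.state = state;
        }
    }

    // Bootstrap is allowed when every inflight instant on the data timeline
    // is the one currently being operated on. Passing null for
    // instantBeingOperatedOn keeps the strict behavior: any inflight blocks.
    static boolean canBootstrap(List<Instant> dataTimeline, String instantBeingOperatedOn) {
        return dataTimeline.stream()
            .filter(i -> i.state == State.INFLIGHT)
            .allMatch(i -> i.timestamp.equals(instantBeingOperatedOn));
    }

    public static void main(String[] args) {
        List<Instant> dataTimeline = Arrays.asList(new Instant("001", State.INFLIGHT));
        System.out.println("strict check: " + canBootstrap(dataTimeline, null));
        System.out.println("ignoring 001: " + canBootstrap(dataTimeline, "001"));
    }
}
```

      Any other inflight instant still blocks the bootstrap, so the relaxation applies only to the commit that the rollback itself put into the inflight state.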






              manojg Manoj Govindassamy
              shivnarayan sivabalan narayanan