Apache Hudi / HUDI-6959

Do not roll back the current instant when bulk insert as row fails

Details

    Description

      When org.apache.hudi.spark3.internal.HoodieDataSourceInternalBatchWrite#abort is called, some subtasks may not have been canceled yet. If we roll back the current instant immediately, those subtasks can still write new files after the rollback has been scheduled, which leaves dirty data behind.

      Instead of rolling back inside abort, we should roll back the failed instant through the common failed-write cleanup mechanism (eager or lazy).
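
The race described above can be illustrated with a small, deterministic sketch. This is not Hudi code: the class `RollbackRaceSketch`, the `Storage` model, the file naming scheme, and the instant time "001" are all hypothetical and exist only to show the ordering problem. It compares rolling back at the moment abort is called with deferring the rollback until no straggler task can still write.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model of the ordering problem. Not Hudi code.
class RollbackRaceSketch {

    /** Files on storage, named "<instant>_<task>.parquet" (assumed naming). */
    static final class Storage {
        final Set<String> files = new TreeSet<>();

        void write(String instant, int task) {
            files.add(instant + "_" + task + ".parquet");
        }

        /** Rollback: delete every file that belongs to the given instant. */
        void rollback(String instant) {
            files.removeIf(f -> f.startsWith(instant + "_"));
        }
    }

    /**
     * Returns the files left on storage for the failed instant.
     * rollbackInAbort = true  -> roll back as soon as abort() is called
     * rollbackInAbort = false -> roll back later, once no writer is active
     */
    static Set<String> leftoverFiles(boolean rollbackInAbort) {
        Storage storage = new Storage();
        String instant = "001";

        storage.write(instant, 0);      // task 0 finished before the failure

        // abort() fires here, but task 1 has not been canceled yet.
        if (rollbackInAbort) {
            storage.rollback(instant);  // removes only what exists right now
        }

        storage.write(instant, 1);      // straggler task 1 writes after abort()

        if (!rollbackInAbort) {
            storage.rollback(instant);  // deferred cleanup sees every file
        }
        return storage.files;
    }

    public static void main(String[] args) {
        // prints: immediate rollback leaves: [001_1.parquet]
        System.out.println("immediate rollback leaves: " + leftoverFiles(true));
        // prints: deferred rollback leaves:  []
        System.out.println("deferred rollback leaves:  " + leftoverFiles(false));
    }
}
```

In this model the immediate rollback leaves one orphaned file from the straggler task, while the deferred rollback leaves none. That is the motivation for delegating cleanup to a mechanism that runs after the failed write can no longer produce files.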

            People

              Assignee: Unassigned
              Reporter: Qijun Fu