  1. Apache Hudi
  2. HUDI-6962

Correct the behavior of bulk insert for NB-CC

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0-beta1
    • Component/s: None

    Description

      How should non-blocking concurrency control (NB-CC) handle a multi-writer setup in which one of the jobs uses the bulk insert operation?
      1. Generated file group ID: Bulk insert must generate a fixed file group ID, because the other jobs use a fixed file group ID suffix instead of a random UUID suffix. The behavior needs to be consistent so that later writer jobs do not write records with the same primary key to different file groups.
      2. Transaction handling: Conflict resolution for bulk insert cannot be deferred to the compaction phase. Bulk insert writers flush data directly into base files, so if there are multiple bulk insert jobs, multiple base files might exist in the same bucket.
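
      The first point can be illustrated with a minimal Java sketch. It is not Hudi's implementation: the class name, the method names, the suffix constant, and the hash-based bucket assignment are all hypothetical and chosen only for illustration. It shows why a fixed suffix lets independent writers agree on the file group ID of a bucket, while a random UUID suffix does not.

```java
import java.util.UUID;

public class FixedFileGroupIdSketch {
    // Hypothetical fixed suffix shared by every writer (an assumption, not Hudi's constant).
    static final String FIXED_SUFFIX = "-0000-4000-8000-000000000000";

    // Hypothetical bucket assignment: hash of the record key modulo the bucket count.
    static int bucketOf(String recordKey, int numBuckets) {
        return Math.floorMod(recordKey.hashCode(), numBuckets);
    }

    // Zero-padded, 8-character bucket prefix.
    static String bucketPrefix(int bucket) {
        return String.format("%08d", bucket);
    }

    // Deterministic ID: every writer derives the same file group for a given bucket.
    static String fixedFileGroupId(int bucket) {
        return bucketPrefix(bucket) + FIXED_SUFFIX;
    }

    // Random-suffix ID: two writers almost certainly derive different file groups.
    static String randomFileGroupId(int bucket) {
        // Drop the first 8 characters of the UUID and keep the "-xxxx-xxxx-xxxx-xxxxxxxxxxxx" tail.
        return bucketPrefix(bucket) + UUID.randomUUID().toString().substring(8);
    }

    public static void main(String[] args) {
        int bucket = bucketOf("record-key-1", 4);

        // Two independent writers (e.g. a bulk insert job and an upsert job) agree on the ID.
        String writerA = fixedFileGroupId(bucket);
        String writerB = fixedFileGroupId(bucket);
        System.out.println("fixed suffix, same file group: " + writerA.equals(writerB));

        // With random suffixes, the same primary key can land in different file groups.
        String randomA = randomFileGroupId(bucket);
        String randomB = randomFileGroupId(bucket);
        System.out.println("random suffix, same file group: " + randomA.equals(randomB));
    }
}
```

      Under these assumptions, a bulk insert job that used the random variant would create a second file group for a bucket that other writers already address through the fixed ID, which is the inconsistency the issue sets out to remove.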

            People

              Assignee: Jing Zhang (jingzhang)
              Reporter: Jing Zhang (jingzhang)