  1. Apache Hudi
  2. HUDI-6724

Initializing prevInstance to HoodieTimeline.INIT_INSTANT_TS to avoid partial reading of first commit

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • None
    • 0.14.0
    • None

    Description

      Since object-based incremental jobs now batch within a commit, the existing code can leave prevInstance equal to startInstance for batches within the first commit.

      In this scenario, an incremental query for rows > prevInstance skips the first commit, because startInstance points to that same commit.

      This happens because the generateQueryInfo API defaults prevInstance to startInstance.

      The fix is to default prevInstance to HoodieTimeline.INIT_INSTANT_TS so that batching can continue on the first commit.
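
      The effect of the two defaults can be illustrated with a minimal, self-contained sketch. This is not Hudi code: the class PrevInstantSketch, the method rowsAfter, and the sample timestamps are hypothetical, and the local INIT_INSTANT_TS constant is an assumed stand-in for HoodieTimeline.INIT_INSTANT_TS (a value that sorts before any real commit time). The filter only mimics the "rows > prevInstance" comparison described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names come from the Hudi code base.
final class PrevInstantSketch {
    // Assumed stand-in for HoodieTimeline.INIT_INSTANT_TS: sorts before every real commit time.
    static final String INIT_INSTANT_TS = "00000000000000";

    // Mimics the incremental filter: keep rows whose commit time is strictly greater than prevInstant.
    static List<String> rowsAfter(List<String> rowCommitTimes, String prevInstant) {
        return rowCommitTimes.stream()
                .filter(commitTime -> commitTime.compareTo(prevInstant) > 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Two rows written by the first commit, one row written by the second commit.
        String firstCommit = "20230801000000";
        List<String> rows = Arrays.asList(firstCommit, firstCommit, "20230802000000");

        // Old default: prevInstant == startInstant == first commit, so first-commit rows are dropped.
        System.out.println(rowsAfter(rows, firstCommit).size());     // prints 1

        // New default: prevInstant sorts before the first commit, so its rows are kept.
        System.out.println(rowsAfter(rows, INIT_INSTANT_TS).size()); // prints 3
    }
}
```

      Because the comparison is strict, any default equal to the first commit's timestamp excludes that commit; a sentinel that sorts before all commits avoids this without changing the comparison.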

            People

              Assignee: Unassigned
              Reporter: Lokesh Lingarajan (linlok)