  1. Apache Hudi
  2. HUDI-7030

Log reader loses data because of inconsistent behavior in the timeline's containsInstant

Details

    Description

      The log reader filters out all log data blocks that come from inflight instants.

      containsInstant should return false when the input instant's timestamp does not equal the timestamp of any instant in the inflight timeline.

      Currently, however, the timeline's containsInstant variant that takes an instant's timestamp as input can return true in that case.

       

      This happens when the input timestamp carries default_millis_ext and is less than the timestamp of some instant in the timeline.

      As a result, the log reader skipped a completed delta commit instant, which caused data loss.

      I think the timeline's containsInstant should behave consistently, and its call to containsOrBeforeTimelineStarts should be changed to containsInstant.
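
      The behavior described above can be illustrated with a small, self-contained sketch. It is a simplified model written for this report, not the Hudi source: the class TimelineSketch, its helper methods, the "999" value and 17-character length assumed for default_millis_ext, and the sample timestamps are all hypothetical. containsInstantReported models the check as described (falling back to containsOrBeforeTimelineStarts), and containsInstantProposed models one reading of the suggested change (exact match only).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified, hypothetical model of a timeline. This is NOT the Hudi source code.
class TimelineSketch {
    // Assumption: default_millis_ext is the suffix "999" on a 17-character timestamp.
    static final String DEFAULT_MILLIS_EXT = "999";
    static final int MILLIS_TS_LENGTH = 17;

    private final List<String> timestamps; // instant timestamps, kept sorted

    TimelineSketch(List<String> instantTimestamps) {
        this.timestamps = new ArrayList<>(instantTimestamps);
        Collections.sort(this.timestamps);
    }

    // True only when some instant in the timeline has exactly this timestamp.
    boolean hasExactTimestamp(String ts) {
        return timestamps.contains(ts);
    }

    // True when ts sorts before the first instant of this timeline.
    boolean isBeforeTimelineStarts(String ts) {
        return !timestamps.isEmpty() && ts.compareTo(timestamps.get(0)) < 0;
    }

    boolean containsOrBeforeTimelineStarts(String ts) {
        return hasExactTimestamp(ts) || isBeforeTimelineStarts(ts);
    }

    // Returns ts without the assumed suffix, or null when the suffix is absent.
    private static String stripDefaultMillisExt(String ts) {
        if (ts.length() == MILLIS_TS_LENGTH && ts.endsWith(DEFAULT_MILLIS_EXT)) {
            return ts.substring(0, ts.length() - DEFAULT_MILLIS_EXT.length());
        }
        return null;
    }

    // Models the reported behavior: a suffixed timestamp that is older than
    // the first instant of the timeline is treated as contained.
    boolean containsInstantReported(String ts) {
        if (hasExactTimestamp(ts)) {
            return true;
        }
        String stripped = stripDefaultMillisExt(ts);
        return stripped != null && containsOrBeforeTimelineStarts(stripped);
    }

    // Models the proposed behavior: only an exact timestamp match counts.
    boolean containsInstantProposed(String ts) {
        if (hasExactTimestamp(ts)) {
            return true;
        }
        String stripped = stripDefaultMillisExt(ts);
        return stripped != null && hasExactTimestamp(stripped);
    }

    public static void main(String[] args) {
        // Hypothetical inflight timeline with a single pending instant.
        TimelineSketch inflight = new TimelineSketch(Arrays.asList("20231103120000123"));
        // Hypothetical completed delta commit whose timestamp carries the suffix.
        String completedDeltaCommit = "20231103100000999";

        // A reader that skips blocks when the inflight timeline "contains" the
        // instant would drop this completed commit under the reported behavior.
        System.out.println(inflight.containsInstantReported(completedDeltaCommit)); // true
        System.out.println(inflight.containsInstantProposed(completedDeltaCommit)); // false
    }
}
```

      In this model the completed delta commit is wrongly reported as part of the inflight timeline only because its stripped timestamp sorts before the first inflight instant, which matches the data loss described above.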

      Attachments

        1. image-2023-11-03-20-07-30-201.png (120 kB, ann)
        2. image-2023-11-03-20-06-13-905.png (733 kB, ann)
        3. image-2023-11-03-20-06-00-579.png (733 kB, ann)
        4. image-2023-11-03-19-58-39-495.png (476 kB, ann)
        5. image-2023-11-03-19-50-11-849.png (226 kB, ann)
        6. image-2023-11-03-19-49-22-894.png (314 kB, ann)
        7. image-2023-11-03-19-48-29-441.png (645 kB, ann)

            People

              Assignee: Unassigned
              Reporter: xoln ann
              Votes: 0
              Watchers: 2
