Apache Hudi / HUDI-4769

Option read.streaming.skip_compaction skips delta commit

    Description

      The option read.streaming.skip_compaction was introduced to avoid consuming duplicate data from both delta-commits and compactions in a MOR table.

      But the option may cause delta-commits to be skipped. Here is the case:

      Suppose we have a timeline (d for delta-commit, C for compaction/commit):

      d1 --> d2 --> C3 --> d4 --> d5 -->

      t1.......................................................t2..........

      Let's say the scans for streaming read happen at times t1 and t2, when d1 and d5 are the latest instants, respectively.

      When we scan at t2 with read.streaming.skip_compaction=true, we get the latest merged file slice, whose log files contain only d4 and d5. So d2, which was committed after the scan at t1, is never consumed.
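
      The scenario above can be reproduced with a small toy model. This is only a sketch for illustration, not Hudi's actual scan logic: the Instant class and both helper functions are hypothetical, and the model assumes that each compaction starts a new file slice, so a scan that skips the compaction output sees only the log files written after the last compaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instant:
    name: str
    kind: str  # "delta" or "compaction" (hypothetical labels for this sketch)


# Timeline from the description: d1 --> d2 --> C3 --> d4 --> d5
TIMELINE = [
    Instant("d1", "delta"),
    Instant("d2", "delta"),
    Instant("C3", "compaction"),
    Instant("d4", "delta"),
    Instant("d5", "delta"),
]


def log_instants_in_latest_slice(timeline, latest_idx):
    """Delta-commits whose log files sit in the latest file slice.

    Assumption of this toy model: a compaction starts a new file slice,
    so only delta-commits after the last visible compaction remain as logs.
    """
    seen = timeline[: latest_idx + 1]
    last_compaction = max(
        (i for i, inst in enumerate(seen) if inst.kind == "compaction"),
        default=-1,
    )
    return [inst.name for inst in seen[last_compaction + 1:] if inst.kind == "delta"]


def missed_delta_commits(timeline, scan_points):
    """Delta-commits never consumed when every scan skips compaction output."""
    consumed = set()
    for latest_idx in scan_points:
        # With skip_compaction=true, only the log files are read.
        consumed.update(log_instants_in_latest_slice(timeline, latest_idx))
    return [
        inst.name
        for inst in timeline
        if inst.kind == "delta" and inst.name not in consumed
    ]


# Scans at t1 (d1 is latest, index 0) and t2 (d5 is latest, index 4).
print(missed_delta_commits(TIMELINE, [0, 4]))  # prints ['d2']
```

      In this model, adding a scan while d2 is the latest instant (index 1) leaves nothing missed, which suggests that the loss depends on a compaction completing between two consecutive scans.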

          People

            codope Sagar Sumit
            nonggia nonggia.liang
            Danny Chen