Apache Hudi / HUDI-1521 [UMBRELLA] RFC-24 HUDI Flink writer proposal / HUDI-1598

Write as minor batches during one checkpoint interval for the new writer

Details

    Description

      Buffering all data during one checkpoint interval and then flushing the whole buffer at once is not resource friendly for streaming writes. A better approach is to cut batches based on the actual in-memory size of the data buffer (say, 128 MB): the writer flushes the buffer whenever its size reaches the configured threshold.
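The size-based flushing described above can be sketched roughly as follows. This is a minimal illustration, not the actual Hudi writer: the class `SizeBoundedBuffer`, its methods, and the use of UTF-8 string length as the size estimate are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical size-bounded buffer; illustrative only, not a Hudi class. */
final class SizeBoundedBuffer {
    private final long maxBytes;                       // configured flush threshold
    private final List<String> records = new ArrayList<>();
    private final List<List<String>> flushedBatches = new ArrayList<>();
    private long bufferedBytes = 0;

    SizeBoundedBuffer(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Buffers a record and flushes once the estimated size reaches the threshold. */
    void add(String record) {
        records.add(record);
        // Assumption: the UTF-8 byte length stands in for the real memory footprint.
        bufferedBytes += record.getBytes(StandardCharsets.UTF_8).length;
        if (bufferedBytes >= maxBytes) {
            flush();
        }
    }

    /** Emits the buffered records as one mini-batch; also called at checkpoint time. */
    void flush() {
        if (records.isEmpty()) {
            return;
        }
        flushedBatches.add(new ArrayList<>(records));  // stand-in for writing to storage
        records.clear();
        bufferedBytes = 0;
    }

    List<List<String>> flushedBatches() {
        return flushedBatches;
    }

    long bufferedBytes() {
        return bufferedBytes;
    }
}
```

With a threshold of 10 bytes, three 4-byte records trigger one flush on the third `add`, and any remainder stays buffered until the next threshold crossing or an explicit `flush()` at checkpoint time.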

      Thus, after this change, one instant may span a single checkpoint (if every checkpoint succeeds) or several checkpoints (if some checkpoints fail). The instant commits only when a checkpoint succeeds.
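The relationship between instants and checkpoints might be modeled like this. It is a sketch under stated assumptions: `InstantTracker`, `onCheckpoint`, and the `instant-N` naming are hypothetical and do not reflect Hudi's coordinator API or its real instant timestamps.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tracker showing how one instant can span several checkpoints. */
final class InstantTracker {
    private int nextInstantId = 1;
    private int checkpointsSpanned = 0;
    private final List<String> committed = new ArrayList<>();

    /** Name of the instant that is currently open for writing. */
    String currentInstant() {
        return "instant-" + nextInstantId;
    }

    /** Called once per checkpoint; only a successful checkpoint commits the instant. */
    void onCheckpoint(boolean success) {
        checkpointsSpanned++;
        if (success) {
            committed.add(currentInstant() + " spanned " + checkpointsSpanned + " checkpoint(s)");
            nextInstantId++;          // start a new instant
            checkpointsSpanned = 0;
        }
        // On failure the instant stays open and keeps collecting mini-batches.
    }

    List<String> committed() {
        return committed;
    }
}
```

A successful checkpoint commits the open instant immediately, while a failed checkpoint followed by a successful one yields a single instant covering both.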


            People

              danny0405 Danny Chen