FLINK-19162 (Flink; sub-task of FLINK-10740 FLIP-27: Refactor Source Interface)

Allow Split Reader based sources to reuse record batches

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.11.3, 1.12.0
    • Component/s: Connectors / Common
    • Labels: None

    Description

      The Split Readers hand over a batch of records at a time from the I/O thread (fetching and decoding) to the main operator processing thread.

      These structures can be memory-intensive and expensive, and performance benefits greatly from reusing them. This is especially true for high-performance format readers like ORC and Parquet.

      While previous sources (where I/O was in the main thread) could reuse objects in a trivial manner, the new Split Reader API (with multiple threads) needs an explicit recycle() hook to allow returning/reusing these objects.
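
      A minimal sketch of the recycling idea, assuming a fixed pool of batch objects shared between an I/O thread and a consumer thread. The names RecordBatch, RecycleSketch, and run are hypothetical and are not Flink classes; only the recycle() hook name comes from this issue. The I/O thread can only obtain a batch after the consumer has returned one, so no new buffers are allocated after startup.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical reusable batch of records; not a Flink class. */
final class RecordBatch {
    final long[] values;   // the expensive buffer we want to reuse
    int size;              // number of valid entries in 'values'
    private final BlockingQueue<RecordBatch> pool;

    RecordBatch(int capacity, BlockingQueue<RecordBatch> pool) {
        this.values = new long[capacity];
        this.pool = pool;
    }

    /** Called by the consumer once it has finished with every record in the batch. */
    void recycle() {
        size = 0;
        pool.add(this); // cannot overflow: pool capacity equals the number of batches
    }
}

final class RecycleSketch {

    /** Returns {sum of all records consumed, number of distinct batch objects seen}. */
    static long[] run(int numBatches, int batchSize, int poolSize) {
        BlockingQueue<RecordBatch> pool = new ArrayBlockingQueue<>(poolSize);
        BlockingQueue<RecordBatch> handover = new ArrayBlockingQueue<>(poolSize);
        for (int i = 0; i < poolSize; i++) {
            pool.add(new RecordBatch(batchSize, pool));
        }

        // Stand-in for the I/O thread: "fetches and decodes" into pooled batches.
        Thread io = new Thread(() -> {
            try {
                long next = 0;
                for (int b = 0; b < numBatches; b++) {
                    RecordBatch batch = pool.take(); // blocks until a batch is free
                    for (int i = 0; i < batchSize; i++) {
                        batch.values[i] = next++;
                    }
                    batch.size = batchSize;
                    handover.put(batch);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "io-thread");
        io.start();

        // Stand-in for the main processing thread: consume, then recycle.
        long sum = 0;
        Set<RecordBatch> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        try {
            for (int b = 0; b < numBatches; b++) {
                RecordBatch batch = handover.take();
                for (int i = 0; i < batch.size; i++) {
                    sum += batch.values[i];
                }
                seen.add(batch);
                batch.recycle(); // hand the buffer back to the I/O thread
            }
            io.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return new long[] {sum, seen.size()};
    }

    public static void main(String[] args) {
        long[] result = run(10, 4, 2);
        System.out.println("sum=" + result[0] + ", distinct batches=" + result[1]);
    }
}
```

      The pool size bounds how far the I/O thread can run ahead of the consumer. If the consumer never calls recycle(), the I/O thread stalls once the pool is empty, which is why the hook has to be an explicit part of the hand-over contract.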

          People

            Assignee: Stephan Ewen (sewen)
            Reporter: Stephan Ewen (sewen)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: