Flink / FLINK-10740 FLIP-27: Refactor Source Interface / FLINK-19162

Allow Split Reader based sources to reuse record batches


    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.12.0, 1.11.3
    • Component/s: Connectors / Common
    • Labels: None

      Description

      The Split Readers hand over a batch of records at a time from the I/O thread (fetching and decoding) to the main operator processing thread.

      These batch structures can be memory-intensive and expensive to allocate, so performance benefits greatly from reusing them. This is especially true for high-performance format readers like ORC and Parquet.

      While previous sources (where I/O was in the main thread) could reuse objects in a trivial manner, the new Split Reader API (with multiple threads) needs an explicit recycle() hook to allow returning/reusing these objects.
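
To illustrate the hand-over described above, here is a minimal sketch of a recycle hook between an I/O thread and a processing thread. It is not Flink's connector code: `RecordBatch`, `BatchRecyclingSketch`, `run`, and the pool and queue sizes are all hypothetical names and values chosen for the example. The only idea taken from this issue is that the consumer explicitly calls `recycle()` so the producer can reuse the batch instead of allocating a new one.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch only: every class and method name here is hypothetical,
// not part of Flink's connector API.
class BatchRecyclingSketch {

    /** A reusable batch that remembers which free-list it must return to. */
    static final class RecordBatch {
        final long[] values;
        int size;
        private final BlockingQueue<RecordBatch> freeBatches;

        RecordBatch(int capacity, BlockingQueue<RecordBatch> freeBatches) {
            this.values = new long[capacity];
            this.freeBatches = freeBatches;
        }

        /** Recycle hook: the processing thread calls this once it is done with the batch. */
        void recycle() {
            size = 0;
            // Cannot overflow as long as each batch is recycled exactly once per hand-over.
            freeBatches.add(this);
        }
    }

    /** Moves the numbers 1..totalRecords across threads and returns their sum. */
    static long run(int totalRecords) {
        final int numBatches = 2;     // assumed pool size
        final int batchCapacity = 4;  // assumed records per batch
        BlockingQueue<RecordBatch> freeBatches = new ArrayBlockingQueue<>(numBatches);
        BlockingQueue<RecordBatch> handover = new ArrayBlockingQueue<>(numBatches);
        for (int i = 0; i < numBatches; i++) {
            freeBatches.add(new RecordBatch(batchCapacity, freeBatches));
        }

        // I/O thread: waits for a free batch, fills it, and hands it over.
        Thread ioThread = new Thread(() -> {
            try {
                long next = 1;
                while (next <= totalRecords) {
                    RecordBatch batch = freeBatches.take();
                    while (batch.size < batch.values.length && next <= totalRecords) {
                        batch.values[batch.size++] = next++;
                    }
                    handover.put(batch);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        ioThread.start();

        // Processing thread (here: the caller): consumes each batch, then recycles it.
        long sum = 0;
        int consumed = 0;
        try {
            while (consumed < totalRecords) {
                RecordBatch batch = handover.take();
                for (int i = 0; i < batch.size; i++) {
                    sum += batch.values[i];
                }
                consumed += batch.size;
                batch.recycle();
            }
            ioThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming batches", e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(10));
    }
}
```

In this sketch only two batches are ever allocated, regardless of how many records flow through. The I/O thread blocks on the free-list until the processing thread has recycled a batch, which also gives a simple form of back-pressure.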

            People

            • Assignee: Stephan Ewen (sewen)
            • Reporter: Stephan Ewen (sewen)
