Flink / FLINK-34442

Support optimizations for pre-partitioned [external] data sources


    Description

      There are some use cases in which data sources are pre-partitioned:

      • A Kafka topic is already partitioned with respect to some key[s]
      • Multiple [Flink] jobs materialize their outputs and subsequently read them back as inputs

      The main benefit of exploiting such pre-partitioning is that unnecessary shuffles can be avoided.
      The DataStream API already provides an experimental feature that supports a subset of these cases [1].
      We should support this for Flink Table/SQL as well.

      [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/experimental/
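      The shuffle-avoidance argument can be illustrated outside of Flink: when the source's partitioning matches the partitioning a downstream keyed operator would apply, a keyed aggregation can run per partition with no data exchange, and the merged per-partition results equal the globally shuffled result. A minimal toy sketch in plain Java (no Flink APIs; all class and method names here are illustrative, not part of any proposed interface):

```java
import java.util.*;
import java.util.stream.*;

public class PrePartitionedAggregation {

    // Route a key to one of n partitions, mirroring how a keyed shuffle
    // (or a keyed Kafka producer) would place records.
    static int partitionFor(String key, int n) {
        return Math.floorMod(key.hashCode(), n);
    }

    // Sum values per key over a single list of (key, value) records.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> records) {
        return records.stream().collect(
            Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, Integer::sum));
    }

    // Aggregate each partition locally, then merge the partial maps.
    // Because every key lives in exactly one partition, the merge never
    // has to combine counts across partitions: no data exchange is needed.
    static Map<String, Integer> aggregatePrePartitioned(
            List<Map.Entry<String, Integer>> input, int n) {
        List<List<Map.Entry<String, Integer>>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) parts.add(new ArrayList<>());
        input.forEach(r -> parts.get(partitionFor(r.getKey(), n)).add(r));

        Map<String, Integer> result = new HashMap<>();
        parts.forEach(p -> result.putAll(sumByKey(p)));
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = List.of(
            Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3),
            Map.entry("c", 4), Map.entry("b", 5));

        // Per-partition ("no shuffle") result equals the global keyed result.
        System.out.println(aggregatePrePartitioned(input, 4).equals(sumByKey(input)));
    }
}
```

      The optimizer-facing question for Table/SQL is precisely when this equivalence may be assumed, i.e., when the source's declared partitioning provably matches the exchange the planner would otherwise insert.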


People

  Assignee: Jeyhun Karimov
  Reporter: Jeyhun Karimov
  Votes: 1
  Watchers: 4
