Flink / FLINK-34442

Support optimizations for pre-partitioned [external] data sources


    Description

      There are some use cases in which data sources are pre-partitioned:

      • A Kafka topic is already partitioned with respect to some key[s]
      • Multiple [Flink] jobs materialize their outputs and subsequently read them back as inputs

      The main benefit of exploiting such pre-partitioning is that unnecessary shuffles can be avoided.
      The DataStream API already provides an experimental feature that supports a subset of these cases [1].
      We should support this for Flink Table/SQL as well.

      [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/experimental/
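      The shuffle-avoidance argument can be illustrated outside of Flink: when the source's partitioning matches the partitioning a downstream keyed operator would apply, a keyed aggregation can run per partition with no data exchange, and the merged per-partition results equal the globally shuffled result. A minimal toy sketch in plain Java (no Flink APIs; all class and method names here are illustrative, not part of any proposed interface):

```java
import java.util.*;
import java.util.stream.*;

public class PrePartitionedAggregation {

    // Route a key to one of n partitions, mirroring how a keyed shuffle
    // (or a keyed Kafka producer) would place records.
    static int partitionFor(String key, int n) {
        return Math.floorMod(key.hashCode(), n);
    }

    // Sum values per key over a single list of (key, value) records.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> records) {
        return records.stream().collect(
            Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, Integer::sum));
    }

    // Aggregate each partition locally, then merge the partial maps.
    // Because every key lives in exactly one partition, the merge never
    // has to combine counts across partitions: no data exchange is needed.
    static Map<String, Integer> aggregatePrePartitioned(
            List<Map.Entry<String, Integer>> input, int n) {
        List<List<Map.Entry<String, Integer>>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) parts.add(new ArrayList<>());
        input.forEach(r -> parts.get(partitionFor(r.getKey(), n)).add(r));

        Map<String, Integer> result = new HashMap<>();
        parts.forEach(p -> result.putAll(sumByKey(p)));
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = List.of(
            Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3),
            Map.entry("c", 4), Map.entry("b", 5));

        // Per-partition ("no shuffle") result equals the global keyed result.
        System.out.println(aggregatePrePartitioned(input, 4).equals(sumByKey(input)));
    }
}
```

      The optimizer-facing question for Table/SQL is precisely when this equivalence may be assumed, i.e., when the source's declared partitioning provably matches the exchange the planner would otherwise insert.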


People

  Assignee: Jeyhun Karimov
  Reporter: Jeyhun Karimov
  Votes: 1
  Watchers: 4
