Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Won't Do
Description
Currently, DataSets and DataStreams cannot be joined with each other. This feature should include the following:
- extend Streaming API to allow one join input to be a DataSet
- as a first step, the DataSet can be limited to a DataSource
- later on, a full Flink program could compute the DataSet
-> maybe a Flink program could be used to update the join DataSet periodically (if the base data changed), including "synchronized" switching from the old to the new DataSet; update triggered by user/time/base-data change?
- in the first version, an inner equi-join should be sufficient
- DataSet is used as build side for Hash-Join
- extend current Hash-Join to consume DataStream as probe input
- for full programs computing the DataSet input, it might be helpful to extend the optimizer?
- What about other joins? What join algorithm do we need to support (full/left/right) outer joins for Set-Stream-Join? What about theta joins?
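A minimal sketch of the build/probe split proposed above, written in plain Java rather than against any Flink API. All names here (`SetStreamJoinSketch`, `loadBuildSide`, `probe`) are hypothetical and only illustrate the idea: the bounded DataSet side is hashed once, each stream element probes that table with inner equi-join semantics, and a refreshed DataSet replaces the old table in a single reference swap. How such a switch would be coordinated across parallel operator instances is not addressed here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical illustration only; this is not Flink API. */
final class SetStreamJoinSketch<K, B, P> {
    private final Function<B, K> buildKey;
    private final Function<P, K> probeKey;
    // Current build-side hash table; swapped as a whole when the DataSet is refreshed.
    private final AtomicReference<Map<K, List<B>>> table =
            new AtomicReference<>(new HashMap<>());

    SetStreamJoinSketch(Function<B, K> buildKey, Function<P, K> probeKey) {
        this.buildKey = buildKey;
        this.probeKey = probeKey;
    }

    /** Build phase: hash the bounded input, then switch to the new table in one step. */
    void loadBuildSide(Iterable<B> dataSet) {
        Map<K, List<B>> fresh = new HashMap<>();
        for (B row : dataSet) {
            fresh.computeIfAbsent(buildKey.apply(row), k -> new ArrayList<>()).add(row);
        }
        table.set(fresh); // probes see either the old or the new table, never a mix
    }

    /** Probe phase: called once per stream element; emits one pair per matching build row. */
    List<Map.Entry<B, P>> probe(P element) {
        List<Map.Entry<B, P>> out = new ArrayList<>();
        List<B> matches = table.get().get(probeKey.apply(element));
        if (matches != null) {
            for (B match : matches) {
                out.add(Map.entry(match, element));
            }
        }
        return out; // empty for unmatched elements (inner join)
    }

    public static void main(String[] args) {
        // Build side: (userId, name). Probe side: (userId, url). Sample data is made up.
        SetStreamJoinSketch<Integer, Map.Entry<Integer, String>, Map.Entry<Integer, String>> join =
                new SetStreamJoinSketch<>(Map.Entry::getKey, Map.Entry::getKey);

        join.loadBuildSide(List.of(Map.entry(1, "alice"), Map.entry(2, "bob")));
        System.out.println(join.probe(Map.entry(1, "/home")));  // one match
        System.out.println(join.probe(Map.entry(3, "/about"))); // no match

        // Refresh the build side; later probes use only the new table.
        join.loadBuildSide(List.of(Map.entry(3, "carol")));
        System.out.println(join.probe(Map.entry(3, "/about"))); // one match after the switch
    }
}
```

An outer join on the stream side could be sketched by emitting the probe element with an empty build row when no match is found. A full or build-side outer join is harder, because an unbounded probe input never signals that a build row will stay unmatched.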
Issue Links
- is related to FLINK-3514 Add support for slowly changing streaming broadcast variables (Open)