Currently, DataSets and DataStreams cannot be joined with each other. This feature should include the following:
- extend Streaming API to allow one join input to be a DataSet
- in a first step, DataSet can be limited to be a DataSource
- later on, a full Flink program could compute the DataSet
-> maybe the Flink program could be used to update the Join-DataSet periodically (if the base data changed), including "synchronized" switching from the old to the new DataSet; update triggered by user/time/base-data change?
- in first version, inner-equi join should be sufficient
- DataSet is used as build side for Hash-Join
- extend current Hash-Join to consume DataStream as probe input
- for full programs computing the DataSet input, it might be helpful to extend the optimizer?
- What about other joins? Which join algorithms do we need in order to support (full/left/right) outer joins for the Set-Stream-Join? What about theta joins?
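
To make the proposed first version concrete, here is a minimal sketch of an inner equi-join that uses the DataSet as the build side of a hash join and the DataStream as the probe side. It is plain Python, not Flink code: the class `SetStreamHashJoin` and its methods `load` and `probe` are hypothetical names chosen for illustration, and nothing here reflects an existing Flink API. Calling `load` again stands in for the "switch from old to new DataSet" idea: the new hash table is built completely on the side and then swapped in with a single reference assignment, so a probe never sees a half-built table.

```python
from collections import defaultdict


class SetStreamHashJoin:
    """Inner equi-join sketch: data set = build side, stream = probe side.

    Hypothetical illustration only; not part of any Flink API.
    """

    def __init__(self, build_key, probe_key):
        self._build_key = build_key  # extracts the join key from a data set row
        self._probe_key = probe_key  # extracts the join key from a stream element
        self._table = {}             # current build-side hash table

    def load(self, data_set):
        """Build a complete hash table from data_set, then swap it in."""
        table = defaultdict(list)
        for row in data_set:
            table[self._build_key(row)].append(row)
        # Single reference assignment: later probes see the old table or the
        # new one, never a partially built table.
        self._table = dict(table)

    def probe(self, stream):
        """Lazily yield (row, element) pairs for each stream element."""
        for element in stream:
            # Look up the current table per element, so a reload between two
            # elements takes effect for the following elements.
            matches = self._table.get(self._probe_key(element), ())
            for row in matches:
                yield (row, element)


# Example data (made up): users as the data set, clicks as the stream.
users = [(1, "alice"), (2, "bob")]
clicks = [(1, "/home"), (3, "/cart"), (2, "/faq"), (1, "/buy")]

join = SetStreamHashJoin(build_key=lambda u: u[0], probe_key=lambda c: c[0])
join.load(users)
joined = list(join.probe(clicks))  # the click with key 3 has no match and is dropped
```

Because unmatched probe elements are simply dropped, this covers only the inner join case. A left outer join that preserves the stream side could emit the element with an empty build row instead, whereas preserving unmatched DataSet rows is harder because the stream never ends, which is one reason the outer-join question above stays open.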