Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
1.0.0
-
None
Description
In the Scala DataStream API the join operation does not enforce that the user has specified a KeySelector for both input sides before applying a window function. Moreover, the order of the where and equalTo clause is not fixed and it is possible to specify multiple where and equalTo clauses. In the latter case, it is not clear which KeySelector will eventually be used by the system.
So the following Flink programs compile without a compilation problem (the first two lines will only fail at runtime):
inputA.join(inputB).equalTo{x => x}.window(TumblingProcessingTimeWindows.of(Time.seconds(10))) .apply(new DefaultFlatJoinFunction[String, String]()).print() inputA.join(inputB).where{x => x}.window(TumblingProcessingTimeWindows.of(Time.seconds(10))) .apply(new DefaultFlatJoinFunction[String, String]()).print() inputA.join(inputB).equalTo{x => x}.where{x => x}.where{x => "1"}.equalTo{x => "42"}.window(TumblingProcessingTimeWindows.of(Time.seconds(10))) .apply(new DefaultFlatJoinFunction[String, String]()).print()
This is unlike the Java DataStream API where a clear pattern of join then where and then equalTo is enforced. I would propose to do the same for the Scala API.