Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-3538

DataStream join API does not enforce consistent usage

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.0.0
    • 1.0.0
    • None

    Description

      In the Scala DataStream API the join operation does not enforce that the user has specified a KeySelector for both input sides before applying a window function. Moreover, the order of the where and equalTo clause is not fixed and it is possible to specify multiple where and equalTo clauses. In the latter case, it is not clear which KeySelector will eventually be used by the system.

      So the following Flink programs compile without a compilation problem (the first two lines will only fail at runtime):

      inputA.join(inputB).equalTo{x => x}.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
            .apply(new DefaultFlatJoinFunction[String, String]()).print()
      
      inputA.join(inputB).where{x => x}.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
            .apply(new DefaultFlatJoinFunction[String, String]()).print()
      
      inputA.join(inputB).equalTo{x => x}.where{x => x}.where{x => "1"}.equalTo{x => "42"}.window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
            .apply(new DefaultFlatJoinFunction[String, String]()).print()
      

      This is unlike the Java DataStream API where a clear pattern of join then where and then equalTo is enforced. I would propose to do the same for the Scala API.

      Attachments

        Activity

          People

            aljoscha Aljoscha Krettek
            trohrmann Till Rohrmann
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: