  1. Flink
  2. FLINK-8478 Add support for time-bounded joins in DataStream API
  3. FLINK-8482

Implement and expose option to use min / max / left / right timestamp for joined streamrecords

Details

    Description

      The idea: Expose the option of which timestamp to use for the result of a join. The options currently floating around are:

      • left: Use timestamp of the element in a join that came from the left stream
      • right: Use timestamp of the element in a join that came from the right stream
      • max: Use the max timestamp of both elements in a join
      • min: Use the min timestamp of both elements in a join
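
      To make the four options concrete, here is a minimal sketch of how the result timestamp could be picked from the two input timestamps. The enum TimestampStrategy and its resolve method are hypothetical names chosen for illustration; they are not part of Flink's API and are not taken from the PoC.

```java
import java.util.function.LongBinaryOperator;

// Hypothetical illustration only: these names are assumptions, not Flink API.
enum TimestampStrategy {
    LEFT((left, right) -> left),   // timestamp of the element from the left stream
    RIGHT((left, right) -> right), // timestamp of the element from the right stream
    MIN(Math::min),                // smaller of the two timestamps
    MAX(Math::max);                // larger of the two timestamps

    private final LongBinaryOperator selector;

    TimestampStrategy(LongBinaryOperator selector) {
        this.selector = selector;
    }

    // Returns the timestamp that the joined record would carry.
    long resolve(long leftTimestamp, long rightTimestamp) {
        return selector.applyAsLong(leftTimestamp, rightTimestamp);
    }
}
```

      For a left element with timestamp 3 and a right element with timestamp 5, LEFT and MIN would both yield 3, while RIGHT and MAX would both yield 5.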

      All options except max require delaying watermarks in the operator, which is something we have been hesitant to do until now. This should probably undergo discussion once more to see if and how we want to add it now. We could even think of exposing this in a more general way by adding a base operator that allows delayed watermarks.
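
      As a rough sketch of why max is the exception: assume a right element joins a left element when its timestamp lies in [left + lowerBound, left + upperBound], that late elements are dropped, and that a pair is emitted when its later-arriving element is processed. The arriving element is then always ahead of the current watermark, so the max timestamp is too, but the buffered partner may already lie behind it. Under these assumptions the worst-case lag per strategy can be computed as below. WatermarkDelay and its methods are hypothetical names, and the bounds are my own back-of-the-envelope reasoning, not something the PoC implements.

```java
// Hypothetical helper for illustration; not Flink API and not from the PoC.
// Assumes: rightTs is in [leftTs + lowerBound, leftTs + upperBound], late
// elements are dropped, and output is emitted when the second element arrives.
final class WatermarkDelay {
    private WatermarkDelay() {}

    // The arriving element is not late, so max(leftTs, rightTs) is never behind
    // the current watermark.
    static long forMax(long lowerBound, long upperBound) {
        return 0L;
    }

    // A buffered left element can lag an arriving right element by up to upperBound.
    static long forLeft(long lowerBound, long upperBound) {
        return Math.max(0L, upperBound);
    }

    // A buffered right element can lag an arriving left element by up to -lowerBound.
    static long forRight(long lowerBound, long upperBound) {
        return Math.max(0L, -lowerBound);
    }

    // min may pick either side, so it needs the larger of the two delays.
    static long forMin(long lowerBound, long upperBound) {
        return Math.max(forLeft(lowerBound, upperBound), forRight(lowerBound, upperBound));
    }
}
```

      With bounds of 0 and 2 milliseconds, this sketch gives a delay of 2 for left and min and 0 for max. For these particular bounds right would also come out at 0, since the lower bound is not negative; in general it needs a delay as soon as the lower bound is below zero.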

      This will also be groundwork for supporting outer joins (FLINK-8483), which in any case need watermark delays to provide correctness.

      The API for this also needs some feedback so that we can expose it in a powerful yet clear way. In my PoC at [1] I used the naming convention left / right to refer to specific streams, which is currently not something the API exposes to the user; we should probably use something more clever here.

      Example

      keyedStreamOne
         .intervalJoin(keyedStreamTwo)
         .between(Time.milliseconds(0), Time.milliseconds(2))
         .assignMinTimestamp() // alternative .assignMaxTimestamp() .assignLeftTimestamp() .assignRightTimestamp()
         .process(new ProcessJoinFunction() { /* impl */ })

      Any feedback is highly appreciated!

      [1] https://github.com/florianschmidt1994/flink/tree/flink-8482-add-option-for-different-timestamp-strategies-to-interval-join-operator

      cc StephanEwen kkl0u

            People

              Assignee: Unassigned
              Reporter: Florian Schmidt (florianschmidt)