  1. Flink
  2. FLINK-3109

Join two streams with two different buffer times

Details

    • Patch

    Description

      Flink streaming currently supports joining two streams only over the same window. How can two streams that need different buffer times be joined?
      For example, consider two streams. One carries the advertisements shown to users, with tuples of the form (id, shown timestamp). The other is a click stream with tuples of the form (id, clicked timestamp). We want a joined stream that contains every advertisement clicked by a user within 20 minutes after it was shown.
      A user may click an advertisement immediately after it is shown, so the "click" message can reach the server earlier than the "show" message because of network delay. We assume that the maximum delay is one minute.

      We therefore need to keep a 20-minute buffer of the "show" stream and a 1-minute buffer of the "click" stream at all times.
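
The two-buffer idea can be illustrated outside Flink with a minimal plain-Java sketch. It is not Flink API and not a proposed implementation: the class name `TwoBufferJoin`, the methods `onShow` and `onClick`, and the `id:showTs:clickTs` output format are all hypothetical. It assumes timestamps are arrival times in milliseconds and that calls are made in non-decreasing timestamp order. Each pair is emitted once, when the later of the two messages arrives.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a join with a separate buffer time per input.
class TwoBufferJoin {
    static final long SHOW_BUFFER_MS = 20 * 60 * 1000L; // keep "show" for 20 minutes
    static final long CLICK_BUFFER_MS = 60 * 1000L;     // keep "click" for 1 minute

    // Per-id arrival timestamps, oldest first.
    private final Map<String, Deque<Long>> shows = new HashMap<>();
    private final Map<String, Deque<Long>> clicks = new HashMap<>();

    // A "show" arrives: join it with clicks buffered during the last minute.
    List<String> onShow(String id, long ts) {
        List<String> out = new ArrayList<>();
        for (long clickTs : live(clicks, id, ts - CLICK_BUFFER_MS)) {
            out.add(id + ":" + ts + ":" + clickTs);
        }
        shows.computeIfAbsent(id, k -> new ArrayDeque<>()).addLast(ts);
        return out;
    }

    // A "click" arrives: join it with shows buffered during the last 20 minutes.
    List<String> onClick(String id, long ts) {
        List<String> out = new ArrayList<>();
        for (long showTs : live(shows, id, ts - SHOW_BUFFER_MS)) {
            out.add(id + ":" + showTs + ":" + ts);
        }
        clicks.computeIfAbsent(id, k -> new ArrayDeque<>()).addLast(ts);
        return out;
    }

    // Drops entries older than oldestAllowed and returns what is left for this id.
    // Only the touched id is cleaned up; a real operator would also need to
    // expire ids that never receive another message.
    private static Deque<Long> live(Map<String, Deque<Long>> buffer, String id, long oldestAllowed) {
        Deque<Long> q = buffer.computeIfAbsent(id, k -> new ArrayDeque<>());
        while (!q.isEmpty() && q.peekFirst() < oldestAllowed) {
            q.removeFirst();
        }
        return q;
    }

    public static void main(String[] args) {
        TwoBufferJoin join = new TwoBufferJoin();
        System.out.println(join.onShow("ad1", 0L));        // prints []
        System.out.println(join.onClick("ad1", 300_000L)); // prints [ad1:0:300000]
    }
}
```

Because a pair is produced only when its second message arrives, this sketch does not emit the same pair more than once, which is the duplication problem described in the linked Stack Overflow question for sliding-window joins.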

      It would be great to have an API like the following:
      showStream.join(clickStream)
      .where(keySelector)
      .buffer(Time.of(20, TimeUnit.MINUTES))
      .equalTo(keySelector)
      .buffer(Time.of(1, TimeUnit.MINUTES))
      .apply(JoinFunction)

      http://stackoverflow.com/questions/33849462/how-to-avoid-repeated-tuples-in-flink-slide-window-join/34024149#34024149

            People

              Assignee: Unassigned
              Reporter: Wang Yangjun (yangjun.wang90@gmail.com)
              Votes: 3
              Watchers: 12

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 48h
                  Remaining: 47h 50m
                  Logged: 10m