Flink / FLINK-22587

Support aggregations in batch mode with DataStream API


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.12.0, 1.13.0
    • Fix Version/s: None
    • Component/s: API / DataStream
    • Labels: None

      Description

      In batch mode, a pipeline like the following outputs no data:

      stream.join(otherStream)
          .where(<KeySelector>)
          .equalTo(<KeySelector>)
          .window(GlobalWindows.create())
          .apply(<JoinFunction>)
      

      Indeed, the default trigger for GlobalWindows is NeverTrigger, which never fires. If we set an EventTimeTrigger instead, it fires on every element: in batch mode the watermark is set to +INF, so it has already passed the end of the global window when each new element arrives. A ProcessingTimeTrigger never fires either, and triggers based on elapsed time or on deltas are not suited to batch.
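      To make the trigger behaviour described above concrete, here is a minimal toy model in plain Java. It is not Flink code and does not use the Flink API: the class BatchTriggerModel, the Trigger interface, and the countFirings helper are hypothetical names introduced only for illustration. It assumes, as stated above, that the batch watermark is +INF (modelled as Long.MAX_VALUE) and that the global window ends at Long.MAX_VALUE.

```java
/** Toy model of the trigger behaviour described in this issue; not Flink code. */
public class BatchTriggerModel {

    /** Assumption: the global window spans all time, so its end is Long.MAX_VALUE. */
    static final long GLOBAL_WINDOW_MAX_TIMESTAMP = Long.MAX_VALUE;

    /** Assumption from the description: in batch mode the watermark is +INF. */
    static final long BATCH_WATERMARK = Long.MAX_VALUE;

    /** Hypothetical, simplified trigger: decides per element whether the window fires. */
    interface Trigger {
        boolean onElement(long windowMaxTimestamp, long currentWatermark);
    }

    /** Models a trigger that never fires, like the default for a global window. */
    static final Trigger NEVER = (windowEnd, watermark) -> false;

    /** Models an event-time trigger: fires once the watermark has reached the window end. */
    static final Trigger EVENT_TIME = (windowEnd, watermark) -> windowEnd <= watermark;

    /** Counts how many times the window fires while elementCount elements arrive. */
    static int countFirings(Trigger trigger, int elementCount, long watermark) {
        int firings = 0;
        for (int i = 0; i < elementCount; i++) {
            if (trigger.onElement(GLOBAL_WINDOW_MAX_TIMESTAMP, watermark)) {
                firings++;
            }
        }
        return firings;
    }

    public static void main(String[] args) {
        // Never-firing trigger: no output at all, whatever the watermark is.
        System.out.println("never: " + countFirings(NEVER, 3, BATCH_WATERMARK));
        // Event-time trigger with a +INF watermark: one firing per element.
        System.out.println("event-time: " + countFirings(EVENT_TIME, 3, BATCH_WATERMARK));
    }
}
```

      In this model neither option gives the result a batch job would want, which is a single firing once all input has been consumed.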

      The same applies when using reduce() instead of join().

      So it seems that something is missing for batch support in the DataStream API.

            People

            • Assignee: Unassigned
            • Reporter: Etienne Chauchot (echauchot)
