Flink / FLINK-2320

Enable DataSet DataStream Joins


    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Do
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      Currently, DataSets and DataStreams cannot be joined with each other. This feature should include the following:

      • extend the Streaming API to allow one join input to be a DataSet
      • in a first step, the DataSet can be limited to a DataSource
      • later on, a full Flink program could compute the DataSet
        -> maybe a Flink program could be used to update the join DataSet periodically (if the base data changed), including "synchronized" switching from the old to the new DataSet; should the update be triggered by the user, by time, or by a base-data change?
      • in the first version, an inner equi-join should be sufficient
      • the DataSet is used as the build side of the hash join
      • extend the current hash join to consume a DataStream as probe input
      • for full programs computing the DataSet input, it might be helpful to extend the optimizer?
      • What about other joins? Which join algorithm do we need to support (full/left/right) outer joins for a Set-Stream join? What about theta joins?
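
      The build/probe idea in the list above can be sketched roughly as follows. This is plain Python for illustration only, not Flink API: the class name SetStreamHashJoin, the method replace_build_side, and the sample records are all hypothetical. It assumes the data set fits in memory, covers only the inner equi-join case, and models the "synchronized" switch as building the new hash table completely before swapping a single reference.

```python
from collections import defaultdict


class SetStreamHashJoin:
    """Inner equi-join sketch: a finite data set is the build side,
    an (unbounded) stream is the probe side. Hypothetical helper, not Flink API."""

    def __init__(self, build_rows, build_key, probe_key):
        self._build_key = build_key
        self._probe_key = probe_key
        self._table = self._build(build_rows)

    def _build(self, rows):
        # Build phase: group all rows of the finite input by their join key.
        table = defaultdict(list)
        for row in rows:
            table[self._build_key(row)].append(row)
        return dict(table)

    def replace_build_side(self, new_rows):
        # Build the new table completely first, then switch with one
        # reference assignment so probes never see a half-built table.
        new_table = self._build(new_rows)
        self._table = new_table

    def probe(self, stream):
        # Probe phase: look up each arriving stream record; records without
        # a matching key are dropped (inner join semantics).
        for record in stream:
            for match in self._table.get(self._probe_key(record), ()):
                yield (match, record)


# Hypothetical sample data: (user_id, name) as the set, (user_id, url) as the stream.
users = [(1, "alice"), (2, "bob")]
clicks = [(1, "/home"), (3, "/cart"), (2, "/docs")]

join = SetStreamHashJoin(users, build_key=lambda u: u[0], probe_key=lambda c: c[0])
results = list(join.probe(clicks))
```

      Because probe() reads the current table for every record, a call to replace_build_side takes effect between two stream records. A distributed implementation would additionally have to coordinate the switch across parallel instances, which this sketch does not attempt. Outer and theta joins are also out of scope here.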


            People

            • Assignee: Unassigned
            • Reporter: Matthias J. Sax (mjsax)

