  1. Flink
  2. FLINK-2320

Enable DataSet DataStream Joins

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Do

    Description

      Currently, DataSets and DataStreams cannot be joined with each other. This feature should include the following:

      • extend the Streaming API to allow one join input to be a DataSet
      • in a first step, the DataSet can be limited to a DataSource
      • later on, a full Flink program could compute the DataSet
        -> maybe a Flink program could be used to update the join DataSet periodically (if the base data changed), including a "synchronized" switch from the old to the new DataSet; should the update be triggered by the user, by time, or by a base-data change?
      • in the first version, an inner equi-join should be sufficient
      • the DataSet is used as the build side of the hash join
      • extend the current hash join to consume a DataStream as probe input
      • for full programs computing the DataSet input, it might be helpful to extend the optimizer?
      • What about other joins? Which join algorithm do we need to support (full/left/right) outer joins for a Set-Stream join? What about theta joins?

            People

              Assignee: Unassigned
              Reporter: Matthias J. Sax (mjsax)
              Votes: 14
              Watchers: 19