XMLWordPrintableJSON

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: Structured Streaming
    • Labels:
      None

      Description

      Current stream-stream join supports inner, left outer and right outer join (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamingSymmetricHashJoinExec.scala#L166 ). With current design of stream-stream join (which marks whether the row is matched or not in state store), it would be very easy to support full outer join as well.

       

      Full outer stream-stream join will work as followed:

      (1).for left side input row, check if there's a match on right side state store. If there's a match, output all matched rows. Put the row in left side state store.

      (2).for right side input row, check if there's a match on left side state store. If there's a match, output all matched rows and update left side rows state with "matched" field to set to true. Put the right side row in right side state store.

      (3).for left side row needs to be evicted from state store, output the row if "matched" field is false.

      (4).for right side row needs to be evicted from state store, output the row if "matched" field is false.

        Attachments

          Activity

            People

            • Assignee:
              chengsu Cheng Su
              Reporter:
              chengsu Cheng Su
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: