Spark / SPARK-34681

Full outer shuffled hash join when building left side produces wrong result

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.0, 3.1.1, 3.2.0
    • Fix Version/s: 3.1.2, 3.2.0
    • Component/s: SQL
    • Labels:

      Description

      A full outer shuffled hash join that builds its hash map on the left side and has a non-equi join condition can produce wrong results.

      The root cause is that `boundCondition` in `HashJoin.scala` always assumes the joined row places the `streamedPlan` row on the left and the `buildPlan` row on the right (`streamedPlan.output ++ buildPlan.output`). This assumption is valid in every case except full outer + build left.
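
      The mismatch can be illustrated with a small model in plain Python. This is a sketch, not Spark code: the column names and the `make_predicate` helper are hypothetical, and it assumes that the full outer path evaluates the condition on rows laid out as left then right.

```python
# Toy model of positional predicate binding. This is NOT Spark code:
# all column names and helpers below are hypothetical.

def make_predicate(schema):
    """Bind the condition `l_val < r_val` to ordinals in `schema`."""
    i = schema.index("l_val")
    j = schema.index("r_val")
    return lambda row: row[i] < row[j]

left_output = ["l_key", "l_val"]
right_output = ["r_key", "r_val"]

# Build left: the build side is the left input, the streamed side the right.
build_output, streamed_output = left_output, right_output

# Assumption: the full outer path evaluates the condition on rows laid out
# as left ++ right, which is build ++ streamed when building left.
joined_row = (1, 10) + (1, 20)  # l_key=1, l_val=10, r_key=1, r_val=20

bound_streamed_first = make_predicate(streamed_output + build_output)
bound_build_first = make_predicate(build_output + streamed_output)

mismatched = bound_streamed_first(joined_row)  # reads the wrong slots: 20 < 10
matched = bound_build_first(joined_row)        # reads the right slots: 10 < 20
```

      In this model the predicate bound to the streamed-first schema reads the wrong positions of the row and rejects a pair that satisfies the condition, which is the kind of silent error the description refers to.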

      The fix is to correct `boundCondition` in `HashJoin.scala` so that it handles the full outer + build left case properly. A reproduction is given in this comment on SPARK-32399: https://issues.apache.org/jira/browse/SPARK-32399?focusedCommentId=17298414&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17298414
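
      One plausible shape for such a fix, sketched in plain Python rather than taken from the actual patch, is to choose the schema the condition is bound against from the join type and build side. The function name `condition_schema` and the string constants are illustrative assumptions.

```python
# Hypothetical sketch of the schema selection; not the actual Spark patch.

def condition_schema(join_type, build_side, streamed_output, build_output):
    """Return the column order the join condition should be bound against."""
    if join_type == "FullOuter" and build_side == "BuildLeft":
        # Assumption: rows in this case are laid out left ++ right,
        # and the left side is the build side.
        return build_output + streamed_output
    # Every other case keeps the streamed ++ build layout.
    return streamed_output + build_output

left_cols = ["l_key", "l_val"]
right_cols = ["r_key", "r_val"]

# Build left means the streamed side is the right input.
full_outer_build_left = condition_schema(
    "FullOuter", "BuildLeft", streamed_output=right_cols, build_output=left_cols
)
# Build right means the streamed side is the left input.
full_outer_build_right = condition_schema(
    "FullOuter", "BuildRight", streamed_output=left_cols, build_output=right_cols
)
```

      Both calls yield the left columns followed by the right columns, so a condition bound against the returned schema sees a consistent layout regardless of which side is built.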

              People

              • Assignee: Cheng Su (chengsu)
              • Reporter: Cheng Su (chengsu)
              • Votes: 0
              • Watchers: 3