Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-30088

Adaptive execution should convert SortMergeJoin to BroadcastJoin when plan generates empty result

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 3.1.0
    • None
    • SQL
    • None

    Description

      Adaptive execution try to  convert SortMergeJoin to BroadcastJoin by checking the `dataSize` metrics of spark plan. However, if the spark plan generates empty result, the `dataSize` metrics is empty due to SQLMetrics's initial value is -1, which could lead to the following check return false

      ```

      private def canBroadcast(plan: LogicalPlan): Boolean =

      { plan.stats.sizeInBytes >= 0 && plan.stats.sizeInBytes <= conf.autoBroadcastJoinThreshold }

      ```

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              EdisonWang EdisonWang
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: