Spark / SPARK-36630

Add the option to use physical statistics to avoid large tables being broadcast


Details

    • Type: Question
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Currently, when an AQE query stage has not yet been materialized, Spark uses the statistics of the logical plan to estimate whether the plan can be converted to a broadcast hash join (BHJ). In some scenarios the estimated size is several orders of magnitude smaller than the data actually broadcast, which can lead to large tables being broadcast.
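
      The failure mode described above can be illustrated with a minimal sketch in plain Python. It is not Spark code and does not show how Spark implements this. `JoinSide`, `physical_size_bytes`, `can_broadcast`, and `prefer_physical` are hypothetical names introduced only for illustration, and the sizes are made up. The one real-world value is the 10 MB threshold, which is the documented default of `spark.sql.autoBroadcastJoinThreshold`.

      ```python
      from dataclasses import dataclass
      from typing import Optional

      # 10 MB, the documented default of spark.sql.autoBroadcastJoinThreshold.
      BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


      @dataclass
      class JoinSide:
          """Hypothetical stand-in for one side of a join (not a Spark class)."""
          logical_size_bytes: int                    # estimate from the logical plan
          physical_size_bytes: Optional[int] = None  # assumed more accurate figure, if any


      def can_broadcast(side: JoinSide,
                        threshold: int = BROADCAST_THRESHOLD_BYTES,
                        prefer_physical: bool = False) -> bool:
          """Size check: broadcast only if the chosen estimate fits the threshold.

          With prefer_physical=False only the logical estimate is consulted,
          which mirrors the behaviour the issue complains about.
          A negative threshold disables broadcasting altogether.
          """
          size = side.logical_size_bytes
          if prefer_physical and side.physical_size_bytes is not None:
              size = side.physical_size_bytes
          return 0 <= size <= threshold


      # Made-up numbers: the logical estimate says 1 MB, the real data is 4 GB.
      side = JoinSide(logical_size_bytes=1 * 1024 * 1024,
                      physical_size_bytes=4 * 1024 ** 3)

      broadcast_with_logical = can_broadcast(side)
      broadcast_with_physical = can_broadcast(side, prefer_physical=True)
      ```

      In this sketch the logical estimate lets the 4 GB side through the size check, while the same check against the assumed physical figure rejects it. That is the kind of option the issue title asks for.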

            People

              Assignee: Unassigned
              Reporter: gaoyajun02
              Votes: 0
              Watchers: 2
