Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-33828 SQL Adaptive Query Execution QA
  3. SPARK-36414

Disable timeout for BroadcastQueryStageExec in AQE

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.0.3, 3.1.2, 3.2.0, 3.3.0
    • 3.2.0
    • SQL
    • None

    Description

      This reverts SPARK-31475, as there are always more concurrent jobs running in AQE mode, especially when running multiple queries at the same time. Currently, the broadcast timeout does not record accurately for the BroadcastQueryStageExec only but also the time waiting for being scheduled. If all the resources are currently being occupied for materializing other stages, it timeouts without a chance to run actually.

       

       

      The default value is 300s, and it's hard to adjust the timeout for AQE mode. Usually, you need an extremely large number for real-world cases. As you can see the example, above, the timeout we used for it is 1800s, and obviously, it needs 3x more or something

       

      Attachments

        Issue Links

          Activity

            People

              Qin Yao Kent Yao
              Qin Yao Kent Yao
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: