Spark / SPARK-35553 Improve correlated subqueries / SPARK-34974

Improve the subquery decorrelation framework


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, Spark has limited support for correlated subqueries: only Filter operators may contain outer column references, and correlated non-equality predicates are not supported when the subquery is aggregated. See https://issues.apache.org/jira/browse/SPARK-18455 for more details on how subquery decorrelation works in Spark.

      The task here is to implement the technique introduced in "Unnesting Arbitrary Queries" by T. Neumann and A. Kemper (http://www.btw-2015.de/res/proceedings/Hauptband/Wiss/Neumann-Unnesting_Arbitrary_Querie.pdf). This allows Spark to have outer query column references in operators other than Filter, to decorrelate subqueries that contain more types of operators, and to support correlated non-equality predicates.
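      As a rough illustration of the idea, and not Spark's actual implementation, the plain-Python sketch below evaluates a query of the form `SELECT t1.a, (SELECT SUM(t2.b) FROM t2 WHERE t2.a < t1.a) FROM t1` in two ways: once per outer row, and in a decorrelated form that computes the distinct outer values, joins them with the inner table on the non-equality predicate, aggregates per outer value, and left-joins the result back. The tables `t1` and `t2`, their contents, and the function names are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical tables; rows are plain tuples.
t1 = [(0,), (1,), (2,), (2,), (5,)]   # t1(a)
t2 = [(0, 10), (1, 20), (3, 30)]      # t2(a, b)


def correlated(outer, inner):
    """Evaluate the aggregated subquery once per outer row."""
    result = []
    for (a,) in outer:
        matches = [b for (a2, b) in inner if a2 < a]  # correlated predicate t2.a < t1.a
        # SUM over an empty input is NULL, modelled here as None.
        result.append((a, sum(matches) if matches else None))
    return result


def decorrelated(outer, inner):
    """Evaluate the same query without re-running the subquery per outer row."""
    # Step 1: the domain of distinct outer values referenced by the subquery.
    domain = {a for (a,) in outer}

    # Step 2: join the domain with the inner table on the non-equality
    # predicate and aggregate, grouped by the outer value.
    sums = defaultdict(int)
    for d in domain:
        for (a2, b) in inner:
            if a2 < d:
                sums[d] += b

    # Step 3: left outer join the aggregates back to the outer rows;
    # outer values with no matching inner rows get NULL (None).
    return [(a, sums.get(a)) for (a,) in outer]


if __name__ == "__main__":
    print(correlated(t1, t2))
    print(decorrelated(t1, t2))
```

      In this sketch the duplicate outer value 2 is aggregated only once in the decorrelated form, and the final left outer join preserves the outer row whose subquery input is empty.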

      This will also make it possible to support lateral derived tables (https://issues.apache.org/jira/browse/SPARK-34382) in the future.


          People

            Assignee: Allison Wang (allisonwang-db)
            Reporter: Allison Wang (allisonwang-db)
            Votes: 0
            Watchers: 3
