Spark / SPARK-35553

Improve correlated subqueries

Details

    • Type: Umbrella
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0, 3.3.0, 3.4.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      This umbrella ticket tracks improvements and bug fixes for correlated subqueries.

      The goal is to gradually remove the restrictions in CheckAnalysis for correlated subqueries.
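
For readers unfamiliar with the term, the following is a minimal sketch of what a correlated subquery is and what decorrelating it means. It uses Python's built-in `sqlite3` module purely as a stand-in SQL engine, not Spark, and the tables `t1` and `t2` and the function `run_queries` are hypothetical names chosen for illustration. The inner query refers to `t1.a` from the outer query; the second statement expresses the same result as a left outer join against a grouped aggregate, which is the general shape that a decorrelation rewrite aims for.

```python
import sqlite3


def run_queries():
    """Run a correlated scalar subquery and a hand-written join equivalent."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE t1 (a INTEGER, b INTEGER);
        CREATE TABLE t2 (a INTEGER, c INTEGER);
        INSERT INTO t1 VALUES (1, 10), (2, 20), (3, 30);
        INSERT INTO t2 VALUES (1, 5), (1, 7), (2, 9);
        """
    )

    # Correlated form: the subquery references the outer column t1.a,
    # so conceptually it is evaluated once per outer row.
    correlated = conn.execute(
        """
        SELECT t1.a,
               (SELECT MAX(t2.c) FROM t2 WHERE t2.a = t1.a) AS max_c
        FROM t1
        ORDER BY t1.a
        """
    ).fetchall()

    # Decorrelated form: aggregate t2 once per correlation key, then
    # left outer join so that outer rows without a match are preserved.
    decorrelated = conn.execute(
        """
        SELECT t1.a, agg.max_c
        FROM t1
        LEFT OUTER JOIN (SELECT a, MAX(c) AS max_c FROM t2 GROUP BY a) AS agg
          ON agg.a = t1.a
        ORDER BY t1.a
        """
    ).fetchall()

    conn.close()
    return correlated, decorrelated


if __name__ == "__main__":
    correlated, decorrelated = run_queries()
    print(correlated)
    print(decorrelated)
```

With `MAX`, both forms return the same rows, including a NULL for the outer row that has no match. This equivalence holds here because `MAX` over an empty input is NULL; it does not hold for every aggregate, which is why restrictions on correlated subqueries exist in the first place.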

      Sub-Tasks

        1. Improve the subquery decorrelation framework [Sub-task, Resolved, Allison Wang]
        2. Support outer references in Aggregate [Sub-task, Open, Unassigned]
        3. Support subqueries with correlated non-equality predicates [Sub-task, Resolved, Allison Wang]
        4. Support set operators to be on correlation paths [Sub-task, Resolved, Allison Wang]
        5. Support ORDER BY and LIMIT to be on the correlation path [Sub-task, Resolved, Allison Wang]
        6. Allow Project to host outer references in scalar subqueries [Sub-task, Resolved, Allison Wang]
        7. Optimize OneRowRelation subqueries [Sub-task, Resolved, Allison Wang]
        8. Enable DecorrelateInnerQuery for IN/EXISTS subqueries [Sub-task, Resolved, Andrey Gubichev]
        9. Unify the logic to handle COUNT bug for scalar and lateral subqueries [Sub-task, Open, Unassigned]
        10. Handle the COUNT bug for correlated IN/EXISTS subquery [Sub-task, Resolved, Unassigned]
        11. ResolveSubquery should skip unary node UnresolvedHaving [Sub-task, Open, Unassigned]
        12. Join can become unresolved after PullupCorrelatedPredicates [Sub-task, Open, Unassigned]
        13. CollapseProject should not collapse correlated scalar subqueries [Sub-task, Resolved, Allison Wang]
        14. Do not collapse Project with Aggregate when correlated subqueries are present in the project list [Sub-task, Resolved, Allison Wang]
        15. Disallow distinct aggregate in lateral subqueries with unsupported correlated predicates [Sub-task, Resolved, Allison Wang]
        16. Allow safe up-cast expressions in correlated equality predicates [Sub-task, Resolved, Allison Wang]
        17. Check unsupported data type when decorrelating subqueries [Sub-task, Resolved, Allison Wang]
        18. Refactor checkCorrelationsInSubquery [Sub-task, Resolved, Allison Wang]
        19. Always inline expressions in OptimizeOneRowRelationSubquery [Sub-task, Resolved, Allison Wang]
        20. Unexpected operators when rewriting scalar subqueries with non-deterministic expressions [Sub-task, Resolved, Allison Wang]
        21. Allow Generate with no required child output to host outer references [Sub-task, Resolved, Allison Wang]
        22. Should not handle the COUNT bug when the GROUP BY clause of a correlated scalar subquery is non-empty [Sub-task, Resolved, Jack Chen]
        23. Correctness COUNT bug in correlated scalar subselect with `COUNT(*) is null` [Sub-task, Resolved, Jack Chen]
        24. Improve scalar subquery logic plan when result are literal [Sub-task, Open, Unassigned]
        25. Correlated EXISTS subqueries in join ON condition unsupported and fail with internal error [Sub-task, Resolved, Jack Chen]
        26. Loosen `correlated scalar subqueries must be aggregated` error by doing runtime check for scalar subqueries output rowcount [Sub-task, Open, Unassigned]
        27. Scalar subquery with group-by and non-equality predicate incorrectly allowed, wrong results [Sub-task, Resolved, Jack Chen]
        28. Support scalar subquery with group-by on column equal to constant [Sub-task, Resolved, Jack Chen]
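
Several of the sub-tasks above mention the COUNT bug. The sketch below illustrates the general problem on a small example. It is not Spark's implementation: it uses Python's built-in `sqlite3` module as a stand-in engine, and the tables `t1` and `t2` and the function `count_bug_demo` are hypothetical. A correlated `COUNT(*)` returns 0 for an outer row with no matches, whereas a naive rewrite into a left outer join over a grouped aggregate returns NULL for that row. Wrapping the joined count in `COALESCE` is one common repair, shown here only as an illustration.

```python
import sqlite3


def count_bug_demo():
    """Compare a correlated COUNT(*) with a naive and a repaired join rewrite."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE t1 (a INTEGER);
        CREATE TABLE t2 (a INTEGER, c INTEGER);
        INSERT INTO t1 VALUES (1), (2), (3);
        INSERT INTO t2 VALUES (1, 5), (1, 7), (2, 9);
        """
    )

    # Reference semantics: COUNT(*) over an empty input is 0, so the
    # outer row a = 3 (no match in t2) gets cnt = 0.
    correlated = conn.execute(
        """
        SELECT t1.a, (SELECT COUNT(*) FROM t2 WHERE t2.a = t1.a) AS cnt
        FROM t1 ORDER BY t1.a
        """
    ).fetchall()

    # Naive decorrelation: the grouped aggregate has no row for a = 3,
    # so the left outer join pads cnt with NULL instead of 0.
    naive = conn.execute(
        """
        SELECT t1.a, agg.cnt
        FROM t1
        LEFT OUTER JOIN (SELECT a, COUNT(*) AS cnt FROM t2 GROUP BY a) AS agg
          ON agg.a = t1.a
        ORDER BY t1.a
        """
    ).fetchall()

    # Repaired rewrite (illustrative): substitute the value the aggregate
    # would produce on empty input for unmatched outer rows.
    repaired = conn.execute(
        """
        SELECT t1.a, COALESCE(agg.cnt, 0) AS cnt
        FROM t1
        LEFT OUTER JOIN (SELECT a, COUNT(*) AS cnt FROM t2 GROUP BY a) AS agg
          ON agg.a = t1.a
        ORDER BY t1.a
        """
    ).fetchall()

    conn.close()
    return correlated, naive, repaired


if __name__ == "__main__":
    for rows in count_bug_demo():
        print(rows)
```

The naive rewrite differs from the correlated query only on the unmatched outer row, which makes this class of bug easy to miss when every outer row in the test data has a match.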

          People

            Assignee: Unassigned
            Reporter: Allison Wang (allisonwang-db)
            Votes: 1
            Watchers: 7
