SPARK-46526

LIMIT over certain correlated subqueries results in a NoSuchElementException

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      The queries that trigger the error:

      • have a LIMIT in the subquery
      • have a correlated predicate that does not depend on the inner query (it references only the outer table).

      For example:

      SELECT COUNT(DISTINCT(t1a))
      FROM t1
      WHERE t1d IN (SELECT t2d
                    FROM   t2
                    WHERE t1a IS NOT NULL
                    LIMIT 10);
       

      Here, the predicate WHERE t1a IS NOT NULL can conceptually be lifted to the join that connects the inner and outer queries.

      Currently, this query fails with an error (a NoSuchElementException).
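
      The lifting described above can be illustrated with a minimal sketch. It does not reproduce the Spark failure: it uses Python's built-in sqlite3 module as a stand-in engine, and the table schemas and rows are hypothetical, since the issue gives neither. It only shows the result the original query is expected to produce, by comparing it with a hand-written form in which the outer-only predicate is moved out of the subquery. t2 is kept under 10 rows so that LIMIT 10 without ORDER BY cannot change the outcome.

```python
import sqlite3

# Hypothetical miniature tables; the issue does not give schemas or data.
# SQLite is only a stand-in here, not the engine the bug was reported against.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (t1a TEXT, t1d INTEGER);
    CREATE TABLE t2 (t2d INTEGER);
    INSERT INTO t1 VALUES ('a', 1), ('b', 2), (NULL, 1), ('c', 99);
    -- Fewer than 10 rows, so LIMIT 10 never drops a row.
    INSERT INTO t2 VALUES (1), (2), (3);
""")

# Query shape from the issue: the correlated predicate (t1a IS NOT NULL)
# references only the outer table.
original = """
    SELECT COUNT(DISTINCT t1a)
    FROM t1
    WHERE t1d IN (SELECT t2d
                  FROM   t2
                  WHERE  t1a IS NOT NULL
                  LIMIT  10)
"""

# Same query with the predicate lifted out of the subquery by hand,
# which leaves the subquery uncorrelated.
lifted = """
    SELECT COUNT(DISTINCT t1a)
    FROM t1
    WHERE t1a IS NOT NULL
      AND t1d IN (SELECT t2d FROM t2 LIMIT 10)
"""

original_count = conn.execute(original).fetchone()[0]
lifted_count = conn.execute(lifted).fetchone()[0]
print(original_count, lifted_count)
```

      For an outer row with a NULL t1a the subquery returns no rows, so the IN test is false. For every other row the subquery is simply the first 10 rows of t2. That is why the two forms agree on this data.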

            People

              Assignee: Andrey Gubichev (gubichev)
              Reporter: Andrey Gubichev (gubichev)
              Votes: 0
              Watchers: 2