  1. Spark
  2. SPARK-35553 Improve correlated subqueries
  3. SPARK-41441

Allow Generate with no required child output to host outer references

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, CheckAnalysis prevents a Generate node from hosting outer references when its required child output is not empty. When the required child output is empty, however, the check lets Generate host outer references, a case that DecorrelateInnerQuery does not handle.
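      The rule described above can be sketched as a toy model. This is not Spark's code: the `Generate` dataclass, its fields, and `may_host_outer_references` are hypothetical names introduced only for illustration, and the real check lives in Spark's Scala analyzer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Generate:
    """Toy stand-in for a Generate plan node (hypothetical, not Spark's class)."""
    outer_references: List[str]  # outer columns used by the generator
    required_child_output: List[str] = field(default_factory=list)


def may_host_outer_references(node: Generate) -> bool:
    """Return True if the node is acceptable under the rule in this ticket.

    A node without outer references is always fine. A node with outer
    references is fine only when it needs no columns from its child.
    """
    if not node.outer_references:
        return True
    return len(node.required_child_output) == 0


# explode(array(outer(c1), outer(c2))) over OneRowRelation: nothing is
# required from the child, so outer references are acceptable.
lateral_explode = Generate(outer_references=["c1", "c2"])

# A generator that also needs a child column (hypothetical c3) stays disallowed.
mixed = Generate(outer_references=["c1"], required_child_output=["c3"])

print(may_host_outer_references(lateral_explode))
print(may_host_outer_references(mixed))
```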

      For example,

      select * from t, lateral (select explode(array(c1, c2)))

      This throws an internal error:

      Caused by: java.lang.AssertionError: assertion failed: Correlated column is not allowed in Generate explode(array(outer(c1#219), outer(c2#220))), false, [col#221]
      +- OneRowRelation

      We should allow Generate to host outer references when its required child output is empty.
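      For reference, the rows the example query would be expected to produce once this case is supported can be sketched in plain Python. This is only a model of lateral-join semantics, not Spark code; the function name `lateral_explode` and the sample contents of `t` are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Row = Tuple[int, int]  # one row of t: (c1, c2)


def lateral_explode(t: Iterable[Row]) -> List[Tuple[int, int, int]]:
    """Model of: select * from t, lateral (select explode(array(c1, c2)))

    For every outer row, the lateral subquery is evaluated with that row's
    c1 and c2, and explode emits one output row per array element.
    """
    result = []
    for c1, c2 in t:
        for col in (c1, c2):  # explode(array(c1, c2))
            result.append((c1, c2, col))
    return result


# Assumed sample data for t; not taken from the ticket.
rows = lateral_explode([(0, 1), (1, 2)])
print(rows)  # [(0, 1, 0), (0, 1, 1), (1, 2, 1), (1, 2, 2)]
```

      Each outer row yields two output rows, one per array element. SQL does not guarantee row order, so only the set of rows is meaningful.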

          People

            Assignee: allisonwang-db Allison Wang
            Reporter: allisonwang-db Allison Wang
