Spark / SPARK-47070

Subquery rewrite inside an aggregation makes an aggregation invalid

Details

    Description

      When an IN/EXISTS subquery appears inside an aggregate expression of a top-level GROUP BY, the subquery is rewritten and a new `exists` attribute is introduced. However, the aggregation does not handle this attribute correctly. For example, consider the following query:

      ```
      SELECT
        CASE
          WHEN t1.id IN (SELECT id FROM t2) THEN 10
          ELSE -10
        END AS v1
      FROM t1
      GROUP BY t1.id;
      ```
       
      Executing it leads to the following error:
      ```
      java.lang.IllegalArgumentException: Cannot find column index for attribute 'exists#844' in: Map()
      ```
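
      To run the query above, `t1` and `t2` must exist, each with an `id` column. The report does not say how they were defined, so the following is only a minimal sketch with hypothetical sample data. It is an assumption that temporary views over inline values exercise the same rewrite path; the failure may depend on the Spark version and on the underlying data source.

      ```
      -- Hypothetical setup: the table contents are not given in the report.
      CREATE OR REPLACE TEMPORARY VIEW t1 AS
        SELECT * FROM VALUES (1), (2), (3) AS t(id);

      CREATE OR REPLACE TEMPORARY VIEW t2 AS
        SELECT * FROM VALUES (2), (3), (4) AS t(id);
      ```

      With this data, a correct execution would return one row per distinct `t1.id`: `10` for ids that also occur in `t2` and `-10` otherwise.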

          People

            anton.kirillov Anton Kirillov
            antonlykov Anton Lykov