Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-19086

Improper scoping of name resolution of columns in HAVING clause

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Not A Problem
    • 2.1.0
    • None
    • SQL
    • None

    Description

      There seems to be a problem on the scoping of name resolution of columns in a HAVING clause.

      Here is a scenario of the problem:

      // A simplified version of TC 01.13 from PR-16337
      Seq((1,1,1)).toDF("t1a", "t1b", "t1c").createOrReplaceTempView("t1")
      Seq((1,1,1)).toDF("t2a", "t2b", "t2c").createOrReplaceTempView("t2")
      
      // This is okay. 
      // Error: t2c is unresolved
      sql("select t2a from t2 group by t2a having t2c = 8").show
      
      // This is okay as t2c is resolved to the t2 on the parent side
      // because t2 in the subquery does not output column t2c.
      sql("select * from t2 where t2a in (select t2a from (select t2a from t2) t2 group by t2a having t2c = 8)").explain(true)
      
      // This is the problem.
      sql("select * from t2 where t2a in (select t2a from t2 group by t2a having t2c = 8)").explain(true)
      
      == Analyzed Logical Plan ==
      t2a: int, t2b: int, t2c: int
      Project [t2a#22, t2b#23, t2c#24]
      +- Filter predicate-subquery#38 [(t2a#22 = t2a#22#49) && (t2c#24 = 8)]
         :  +- Project [t2a#22 AS t2a#22#49]
         :     +- Aggregate [t2a#22], [t2a#22]
         :        +- SubqueryAlias t2, `t2`
         :           +- Project [_1#18 AS t2a#22, _2#19 AS t2b#23, _3#20 AS t2c#24]
         :              +- LocalRelation [_1#18, _2#19, _3#20]
         +- SubqueryAlias t2, `t2`
            +- Project [_1#18 AS t2a#22, _2#19 AS t2b#23, _3#20 AS t2c#24]
               +- LocalRelation [_1#18, _2#19, _3#20]
      

      We should not resolve t2c in the subquery to the outer t2 on the parent side. It should try to resolve t2c to the t2 in the subquery from its current scope and raise an exception because it is invalid to pull up the column t2c from the Aggregate operator below.

      Attachments

        Activity

          People

            Unassigned Unassigned
            nsyca Nattavut Sutyanyong
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: