  Spark / SPARK-33338

GROUP BY using literal map should not fail

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.7, 3.0.1, 3.1.0
    • Fix Version/s: 2.4.8, 3.0.2, 3.1.0
    • Component/s: SQL
    • Labels:
      None

      Description

      Apache Spark 2.x through 3.0.1 raises a `RuntimeException` for the following `GROUP BY` queries, which index a literal map with a column.
      SQL

      CREATE TABLE t USING ORC AS SELECT map('k1', 'v1') m, 'k1' k
      SELECT map('k1', 'v1')[k] FROM t GROUP BY 1
      SELECT map('k1', 'v1')[k] FROM t GROUP BY map('k1', 'v1')[k]
      SELECT map('k1', 'v1')[k] a FROM t GROUP BY a
      

      ERROR

      Caused by: java.lang.RuntimeException: Couldn't find k#3 in [keys: [k1], values: [v1][k#3]#6]
      	at scala.sys.package$.error(package.scala:27)
      	at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1$$anonfun$applyOrElse$1.apply(BoundAttribute.scala:85)
      	at org.apache.spark.sql.catalyst.expressions.BindReferences$$anonfun$bindReference$1$$anonfun$applyOrElse$1.apply(BoundAttribute.scala:79)
      	at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
      

      This is a regression from Apache Spark 1.6.x, where the same query succeeds:

      scala> sc.version
      res1: String = 1.6.3
      
      scala> sqlContext.sql("SELECT map('k1', 'v1')[k] FROM t GROUP BY map('k1', 'v1')[k]").show
      +---+
      |_c0|
      +---+
      | v1|
      +---+
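
To check a given build against this report, the queries above can be wrapped in a small driver. The following is a minimal sketch, not part of the original ticket: the object name `Spark33338Repro` is hypothetical, and it assumes a Spark 2.x/3.x build on the classpath in which `CREATE TABLE ... USING ORC` is available (older 2.x releases may need Hive support for ORC). Per the description, affected versions are expected to report a failure for each of the three queries, while fixed versions should return `v1`.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.{Failure, Success, Try}

// Hypothetical object name; not part of Spark or of the original report.
object Spark33338Repro {
  // The three GROUP BY queries quoted in the description.
  val queries: Seq[String] = Seq(
    "SELECT map('k1', 'v1')[k] FROM t GROUP BY 1",
    "SELECT map('k1', 'v1')[k] FROM t GROUP BY map('k1', 'v1')[k]",
    "SELECT map('k1', 'v1')[k] a FROM t GROUP BY a"
  )

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough to trigger the failure.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-33338 repro")
      .getOrCreate()
    try {
      // Recreate the table from the description so the run is repeatable.
      spark.sql("DROP TABLE IF EXISTS t")
      spark.sql("CREATE TABLE t USING ORC AS SELECT map('k1', 'v1') m, 'k1' k")
      queries.foreach { q =>
        // collect() forces execution; Try captures any non-fatal exception it raises.
        Try(spark.sql(q).collect()) match {
          case Success(rows) => println(s"OK   $q -> ${rows.mkString(", ")}")
          case Failure(e)    => println(s"FAIL $q -> ${e.getClass.getName}: ${e.getMessage}")
        }
      }
    } finally {
      spark.sql("DROP TABLE IF EXISTS t")
      spark.stop()
    }
  }
}
```

The table is created from the original `CREATE TABLE` statement rather than an inline literal so that `k` remains a real column reference in the query plan.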
      

            People

            • Assignee: Dongjoon Hyun (dongjoon)
            • Reporter: Dongjoon Hyun (dongjoon)