Spark / SPARK-34763

col(), $"<name>" and df("name") should handle quoted column names properly.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.1, 3.2.0
    • Fix Version/s: 3.0.3, 3.1.2, 3.2.0
    • Component/s: SQL
    • Labels: None

    Description

      Quoted column names like `a``b.c` (a quoted name containing an escaped backtick and a dot) cannot be referenced with col(), $"<name>" and df("<name>") because these APIs don't parse such column names properly.

      For example, suppose we have the following DataFrame:

      val df1 = spark.sql("SELECT 'col1' AS `a``b.c`")
      

      For this DataFrame, the following query executes successfully:

      scala> df1.selectExpr("`a``b.c`").show
      +-----+
      |a`b.c|
      +-----+
      | col1|
      +-----+
      

      But the following query fails because df1("`a``b.c`") throws an AnalysisException while parsing the attribute name:

      scala> df1.select(df1("`a``b.c`")).show
      org.apache.spark.sql.AnalysisException: syntax error in attribute name: `a``b.c`;
        at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.e$1(unresolved.scala:152)
        at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.parseAttributeName(unresolved.scala:162)
        at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveQuoted(LogicalPlan.scala:121)
        at org.apache.spark.sql.Dataset.resolve(Dataset.scala:221)
        at org.apache.spark.sql.Dataset.col(Dataset.scala:1274)
        at org.apache.spark.sql.Dataset.apply(Dataset.scala:1241)
        ... 49 elided
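
      The stack trace points at attribute-name parsing, so the expected behavior can be illustrated with a minimal, self-contained sketch of a quote-aware name parser. This is not Spark's implementation: the object QuotedNameSketch and the method parseName are hypothetical names, and the rules are assumed from the example above (dots split name parts outside backticks, and a doubled backtick inside a quoted part stands for one literal backtick).

```scala
object QuotedNameSketch {
  /** Splits a dotted name into its parts, honoring backtick quoting.
    * Assumption: inside backticks, `` denotes a single literal backtick.
    * Hypothetical helper for illustration only, not Spark's parser. */
  def parseName(name: String): Seq[String] = {
    def fail(): Nothing =
      throw new IllegalArgumentException(s"syntax error in attribute name: $name")

    val parts = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inBacktick = false
    var i = 0
    while (i < name.length) {
      val c = name(i)
      if (inBacktick) {
        if (c == '`') {
          if (i + 1 < name.length && name(i + 1) == '`') {
            // Doubled backtick: keep one literal backtick and skip the second.
            current.append('`')
            i += 1
          } else {
            // Closing quote: it must be followed by '.' or the end of input.
            inBacktick = false
            if (i + 1 < name.length && name(i + 1) != '.') fail()
          }
        } else {
          current.append(c)
        }
      } else {
        if (c == '`') {
          // An opening quote is only allowed at the start of a name part.
          if (current.nonEmpty) fail()
          inBacktick = true
        } else if (c == '.') {
          // Reject leading, trailing, and consecutive dots.
          if (i == 0 || name(i - 1) == '.' || i == name.length - 1) fail()
          parts += current.toString
          current.clear()
        } else {
          current.append(c)
        }
      }
      i += 1
    }
    if (inBacktick) fail() // unterminated quoted part
    parts += current.toString
    parts.toSeq
  }
}
```

      Under these assumptions, QuotedNameSketch.parseName("`a``b.c`") yields the single part a`b.c, which matches the column name shown by selectExpr above, while an unquoted a.b is split into the two parts a and b.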
      

          People

            Assignee: Kousuke Saruta (sarutak)
            Reporter: Kousuke Saruta (sarutak)
