[SPARK-34763] col(), $"<name>" and df("name") should handle quoted column names properly. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1.1, 3.2.0
Fix Version/s: 3.0.3, 3.1.2, 3.2.0
Component/s: SQL
Labels: None

Description

Quoted column names that contain an escaped backtick, such as `a``b.c`, cannot be referenced with col(), $"<name>" or df("<name>") because these APIs do not parse such names correctly.

For example, suppose we have the following DataFrame:

val df1 = spark.sql("SELECT 'col1' AS `a``b.c`")

For this DataFrame, the following query executes successfully:

scala> df1.selectExpr("`a``b.c`").show
+-----+
|a`b.c|
+-----+
| col1|
+-----+

However, the following query fails because df1("`a``b.c`") throws an AnalysisException:

scala> df1.select(df1("`a``b.c`")).show
org.apache.spark.sql.AnalysisException: syntax error in attribute name: `a``b.c`;
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.e$1(unresolved.scala:152)
  at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute$.parseAttributeName(unresolved.scala:162)
  at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveQuoted(LogicalPlan.scala:121)
  at org.apache.spark.sql.Dataset.resolve(Dataset.scala:221)
  at org.apache.spark.sql.Dataset.col(Dataset.scala:1274)
  at org.apache.spark.sql.Dataset.apply(Dataset.scala:1241)
  ... 49 elided
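
The trace points at the attribute-name parser, which rejects the doubled backtick. As a rough illustration of the expected behavior, here is a minimal, Spark-free sketch of a parser that treats `` inside a quoted part as one literal backtick. The object QuotedNameSketch and its methods are hypothetical names chosen for this example; this is not the code in unresolved.scala nor necessarily the change made in the linked pull request.

```scala
// Hypothetical sketch: NOT Spark's implementation, only an illustration
// of parsing dotted attribute names with backtick quoting.
object QuotedNameSketch {

  /** Splits a name on unquoted dots; `` inside backticks yields one backtick. */
  def parseAttributeName(name: String): Seq[String] = {
    def fail(): Nothing =
      throw new IllegalArgumentException(s"syntax error in attribute name: $name")

    val parts = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    var inBacktick = false
    var i = 0
    while (i < name.length) {
      val c = name(i)
      if (inBacktick) {
        if (c == '`') {
          if (i + 1 < name.length && name(i + 1) == '`') {
            // Doubled backtick: keep one literal backtick and skip the second.
            current.append('`')
            i += 1
          } else {
            // Closing backtick: only a dot or the end of input may follow.
            inBacktick = false
            if (i + 1 < name.length && name(i + 1) != '.') fail()
          }
        } else {
          current.append(c)
        }
      } else {
        if (c == '`') {
          // A quoted part must start at the beginning of a name part.
          if (current.length > 0) fail()
          inBacktick = true
        } else if (c == '.') {
          // Reject leading, trailing, and consecutive dots.
          if (i == 0 || name(i - 1) == '.' || i == name.length - 1) fail()
          parts += current.toString
          current.clear()
        } else {
          current.append(c)
        }
      }
      i += 1
    }
    if (inBacktick) fail() // unterminated quote
    parts += current.toString
    parts.toSeq
  }

  /** Inverse direction: wraps a raw name part in backticks, doubling embedded ones. */
  def quote(part: String): String = "`" + part.replace("`", "``") + "`"
}
```

With this sketch, QuotedNameSketch.parseAttributeName("`a``b.c`") yields the single part a`b.c, and QuotedNameSketch.quote("a`b.c") produces `a``b.c` again, so quoting and parsing round-trip.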

Issue Links

links to

[Github] Pull Request #31854 (sarutak)

People

Assignee: Kousuke Saruta

Reporter: Kousuke Saruta

Votes: 0

Watchers: 2

Dates

Created: 16/Mar/21 15:17

Updated: 20/Dec/21 09:26

Resolved: 24/Mar/21 05:44