[SPARK-24781] Using a reference from Dataset in Filter/Sort might not work. - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 2.3.1
Fix Version/s: 2.3.2, 2.4.0
Component/s: SQL
Labels: None
Target Version/s: 2.3.2

Description

When filter or sort uses a column reference taken from the Dataset (e.g. df("id")) that was not included in the preceding select, an AnalysisException is thrown. For example:

val df = Seq(("test1", 0), ("test2", 1)).toDF("name", "id")
df.select(df("name")).filter(df("id") === 0).show()

org.apache.spark.sql.AnalysisException: Resolved attribute(s) id#6 missing from name#5 in operator !Filter (id#6 = 0).;;
!Filter (id#6 = 0)
   +- AnalysisBarrier
      +- Project [name#5]
         +- Project [_1#2 AS name#5, _2#3 AS id#6]
            +- LocalRelation [_1#2, _2#3]

If we use col instead, it works:

val df = Seq(("test1", 0), ("test2", 1)).toDF("name", "id")
df.select(col("name")).filter(col("id") === 0).show()
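
On an affected version (2.3.1), one way to sidestep the failure is to filter before projecting, so that df("id") is still part of the plan's output when the filter is analyzed. The sketch below is only an illustration and is not taken from the pull request: the object name Spark24781Sketch, its helper methods, and the local[1] master are assumptions made to keep it self-contained. It puts that ordering next to the col-based form from the description.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical names throughout; only the sample data comes from the report.
object Spark24781Sketch {

  // Same two-row DataFrame as in the description.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("test1", 0), ("test2", 1)).toDF("name", "id")
  }

  // Filter while "id" is still in the output, then project to "name".
  def filterThenSelect(df: DataFrame): DataFrame =
    df.filter(df("id") === 0).select(df("name"))

  // Form reported to work: col(...) is resolved by name during analysis
  // rather than being bound to an attribute of df up front.
  def selectThenFilterByName(df: DataFrame): DataFrame =
    df.select(col("name")).filter(col("id") === 0)

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is enough for the demo.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-24781-sketch")
      .getOrCreate()
    try {
      val df = sample(spark)
      filterThenSelect(df).show()
      selectThenFilterByName(df).show()
    } finally {
      spark.stop()
    }
  }
}
```

Both helpers should return only the row whose id is 0. On 2.3.2 and 2.4.0 the original df.select(df("name")).filter(df("id") === 0) form is expected to work as well, since the issue is marked fixed there.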

Issue Links

links to: [GitHub] Pull Request #21745 (viirya)

People

Assignee: L. C. Hsieh

Reporter: Takuya Ueshin

Dates

Created: 11/Jul/18 05:07

Updated: 13/Jul/18 15:26

Resolved: 13/Jul/18 15:26