[SPARK-34476] Duplicate referenceNames are given for ambiguousReferences - ASF JIRA

Details

Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: 3.0.0
Fix Version/s: None
Component/s: Spark Core
Labels: None

Description

When running a test with a Spark extension that converts a custom function to a JSON path expression, I saw the following in the test output:

2021-02-19 21:57:24,550 (Time-limited test) [INFO - org.yb.loadtest.TestSpark3Jsonb.testJsonb(TestSpark3Jsonb.java:102)] plan is == Physical Plan ==
org.apache.spark.sql.AnalysisException: Reference 'phone->'key'->1->'m'->2->>'b'' is ambiguous, could be: mycatalog.test.person.phone->'key'->1->'m'->2->>'b', mycatalog.test.person.phone->'key'->1->'m'->2->>'b'.; line 1 pos 8

Note that the two candidates listed after 'could be' are identical.
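
To illustrate why the message is unhelpful, here is a minimal Python sketch, not Spark's actual resolution code, that collapses identical candidate names before they are reported. The helper name `distinct_candidates` is hypothetical; the candidate strings are copied from the error above. Identical printed names may still refer to distinct attributes internally, so the sketch only shows that the names in the message carry no distinguishing information.

```python
def distinct_candidates(candidates):
    """Return the candidates with exact duplicates removed, preserving order."""
    seen = set()
    unique = []
    for name in candidates:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique


# Candidate names as they appear after "could be" in the error message.
reported = [
    "mycatalog.test.person.phone->'key'->1->'m'->2->>'b'",
    "mycatalog.test.person.phone->'key'->1->'m'->2->>'b'",
]

unique = distinct_candidates(reported)
# The message is only useful if more than one distinct name remains.
distinguishable = len(unique) > 1
```

With the names from this report, only one distinct name is left, so a reader cannot tell the two candidates apart from the message alone.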
Here is the physical plan for a working query where phone is a jsonb column:

TakeOrderedAndProject(limit=2, orderBy=[id#6 ASC NULLS FIRST], output=[id#6,address#7,key#0])
+- *(1) Project [id#6, address#7, phone->'key'->1->'m'->2->'b'#12 AS key#0]
   +- BatchScan[id#6, address#7, phone->'key'->1->'m'->2->'b'#12] Cassandra Scan: test.person
 - Cassandra Filters: [[phone->'key'->1->'m'->2->>'b' >= ?, 100]]
 - Requested Columns: [id,address,phone->'key'->1->'m'->2->'b']

The failing query differs in that it uses

phone->'key'->1->'m'->2->>'b'

in the projection. The same expression works when it is used only as part of the filter.
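
The two expressions differ only in their final operator (`->` in the working projection, `->>` in the filter and in the failing projection). The following Python sketch makes that difference explicit by splitting an expression into its steps. It assumes the extension follows the PostgreSQL-style jsonb convention in which `->` yields a JSON value and `->>` yields text; the function name `split_json_path` and the step grammar are hypothetical and are not taken from the extension.

```python
import re

# One step: an operator (->> or ->) followed by a quoted key or an integer index.
# "->>" is listed first so it is not matched as "->" followed by ">".
STEP = re.compile(r"(->>|->)('[^']*'|\d+)")


def split_json_path(expr):
    """Split e.g. phone->'key'->1 into (column, [(operator, operand), ...])."""
    first = expr.find("->")
    if first == -1:
        return expr, []
    column, rest = expr[:first], expr[first:]
    steps = STEP.findall(rest)
    # Reject input that the step pattern did not consume completely.
    if "".join(op + operand for op, operand in steps) != rest:
        raise ValueError("unrecognized path expression: " + expr)
    return column, steps


working = split_json_path("phone->'key'->1->'m'->2->'b'")
failing = split_json_path("phone->'key'->1->'m'->2->>'b'")

# Same column and same leading steps; only the last operator differs.
same_prefix = working[0] == failing[0] and working[1][:-1] == failing[1][:-1]
last_operators = (working[1][-1][0], failing[1][-1][0])
```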

Issue Links

links to

[Github] Pull Request #31613 (tedyu)

People

Assignee: Unassigned

Reporter: Ted Yu

Votes: 0

Watchers: 2

Dates

Created: 19/Feb/21 22:17

Updated: 22/Feb/21 17:02