Spark / SPARK-34476

Duplicate referenceNames are given for ambiguousReferences


Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      When running a test with a Spark extension that converts a custom function to a JSON path expression, I saw the following in the test output:

      2021-02-19 21:57:24,550 (Time-limited test) [INFO - org.yb.loadtest.TestSpark3Jsonb.testJsonb(TestSpark3Jsonb.java:102)] plan is == Physical Plan ==
      org.apache.spark.sql.AnalysisException: Reference 'phone->'key'->1->'m'->2->>'b'' is ambiguous, could be: mycatalog.test.person.phone->'key'->1->'m'->2->>'b', mycatalog.test.person.phone->'key'->1->'m'->2->>'b'.; line 1 pos 8
      

      Note that the two candidates listed after 'could be' are identical.
      Here is the physical plan for a working query where phone is a jsonb column:

      TakeOrderedAndProject(limit=2, orderBy=[id#6 ASC NULLS FIRST], output=[id#6,address#7,key#0])
      +- *(1) Project [id#6, address#7, phone->'key'->1->'m'->2->'b'#12 AS key#0]
         +- BatchScan[id#6, address#7, phone->'key'->1->'m'->2->'b'#12] Cassandra Scan: test.person
       - Cassandra Filters: [[phone->'key'->1->'m'->2->>'b' >= ?, 100]]
       - Requested Columns: [id,address,phone->'key'->1->'m'->2->'b']
      

      The failing query differs in that it uses

      phone->'key'->1->'m'->2->>'b'

      in the projection. The same expression works when it is used only in the filter.
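
For triaging test logs, the duplicate-candidate symptom can be detected mechanically. The following is a minimal sketch, assuming the message layout matches the AnalysisException text quoted above ("... is ambiguous, could be: A, B.; line N pos M") and that candidate names never contain ", ". The helper names `ambiguous_candidates` and `candidates_are_identical` are hypothetical and are not part of Spark.

```python
import re

# Error text copied from the test output in this report.
MESSAGE = (
    "Reference 'phone->'key'->1->'m'->2->>'b'' is ambiguous, could be: "
    "mycatalog.test.person.phone->'key'->1->'m'->2->>'b', "
    "mycatalog.test.person.phone->'key'->1->'m'->2->>'b'.; line 1 pos 8"
)

# Assumed layout: candidates sit between "could be: " and ".; line N pos M".
_PATTERN = re.compile(
    r"is ambiguous, could be: (?P<candidates>.*?)\.; line \d+ pos \d+"
)


def ambiguous_candidates(message):
    """Return the candidate names listed after 'could be', or [] if absent."""
    match = _PATTERN.search(message)
    if match is None:
        return []
    # Assumption: candidates are separated by ", " and contain no ", " themselves.
    return match.group("candidates").split(", ")


def candidates_are_identical(message):
    """True when more than one candidate is listed and all of them are equal."""
    candidates = ambiguous_candidates(message)
    return len(candidates) > 1 and len(set(candidates)) == 1
```

Applied to the message in this report, `candidates_are_identical(MESSAGE)` returns `True`, which separates this symptom from a genuine ambiguity between two differently qualified columns.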

          People

            Assignee: Unassigned
            Reporter: Ted Yu (yuzhihong@gmail.com)