SPARK-3395

[SQL] DSL uses incorrect attribute ids after a distinct()

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: SQL
    • Labels: None

    Description

      In the following example,

      val rdd = ... // two columns: {key, value}
      val derivedRDD = rdd.distinct().limit(1)
      sql("explain select * from rdd inner join derivedRDD on rdd.key = derivedRDD.key")

      The inner join executes incorrectly because rdd.key and derivedRDD.key end up with the same attribute id after analysis, so the join condition no longer distinguishes the two sides.
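
To make the failure mode concrete, here is a minimal toy sketch of why a shared attribute id breaks the join condition. It is not Catalyst code: Attr, Relation, deriveSharingIds, deriveWithFreshIds, resolveJoinKeys and isAmbiguous are hypothetical names invented for this illustration, and the fresh-id variant is shown only as a contrast, not as the fix that was actually applied.

```scala
import java.util.concurrent.atomic.AtomicLong

// Toy model only: none of these types are Spark's or Catalyst's actual classes.
object AttributeIdSketch {
  // An attribute is identified by its id; the name is only a label.
  final case class Attr(name: String, id: Long)

  // A relation registered under a table name, exposing its output attributes.
  final case class Relation(table: String, output: Seq[Attr])

  private val counter = new AtomicLong(0L)
  def freshId(): Long = counter.incrementAndGet()

  // Assumed buggy behaviour: the derived relation reuses the parent's attribute ids.
  def deriveSharingIds(parent: Relation, table: String): Relation =
    Relation(table, parent.output)

  // Contrast case: the derived relation gets new ids for the same column names.
  def deriveWithFreshIds(parent: Relation, table: String): Relation =
    Relation(table, parent.output.map(a => a.copy(id = freshId())))

  // Resolve "left.column = right.column" to the pair of attributes it refers to.
  def resolveJoinKeys(left: Relation, right: Relation, column: String): (Attr, Attr) =
    (left.output.find(_.name == column).get, right.output.find(_.name == column).get)

  // If both sides resolve to the same id, the condition compares an attribute with itself.
  def isAmbiguous(keys: (Attr, Attr)): Boolean = keys._1.id == keys._2.id

  def main(args: Array[String]): Unit = {
    val rdd = Relation("rdd", Seq(Attr("key", freshId()), Attr("value", freshId())))
    val shared = deriveSharingIds(rdd, "derivedRDD")
    val fresh = deriveWithFreshIds(rdd, "derivedRDD")

    println(isAmbiguous(resolveJoinKeys(rdd, shared, "key"))) // prints "true"
    println(isAmbiguous(resolveJoinKeys(rdd, fresh, "key")))  // prints "false"
  }
}
```

In the shared-id case the condition rdd.key = derivedRDD.key resolves to the same attribute on both sides, which is the situation the description reports.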

          People

            Assignee: Eric Liang (ekhliang)
            Reporter: Eric Liang (ekhliang)