Spark / SPARK-40429

Only set KeyGroupedPartitioning when the referenced column is in the output


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.3.0, 3.4.0
    • Fix Version/s: 3.3.1, 3.4.0
    • Component/s: SQL
    • Labels: None

    Description

            sql(s"CREATE TABLE $tbl (id bigint, data string) PARTITIONED BY (id)")
            sql(s"INSERT INTO $tbl VALUES (1, 'a'), (2, 'b'), (3, 'c')")
            checkAnswer(
              spark.table(tbl).select("index", "_partition"),
              Seq(Row(0, "3"), Row(0, "2"), Row(0, "1"))
            )
      

      This test fails with:
      ScalaTestFailureLocation: org.apache.spark.sql.QueryTest at (QueryTest.scala:226)
      org.scalatest.exceptions.TestFailedException: AttributeSet(id#994L) was not empty The optimized logical plan has missing inputs:
      RelationV2[index#998, _partition#999] testcat.t
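
One reading of the failure, going by the issue title, is that the scan reports a partitioning keyed on `id` even though the query selects only the metadata columns `index` and `_partition`, so the plan ends up referencing an attribute that the relation does not output. The guard the title describes can be sketched as below. This is a minimal illustration with hypothetical, simplified types (`Attr`, `KeyGrouped`, `PartitioningSketch`); it is not Spark's internal code and does not use Spark's APIs.

```scala
// Hypothetical, simplified stand-ins for illustration only; not Spark classes.
final case class Attr(name: String, exprId: Long)

// Stand-in for a key-grouped partitioning description.
final case class KeyGrouped(keys: Seq[Attr])

object PartitioningSketch {
  /** Report key-grouped partitioning only when every partition key
    * is among the columns the scan actually outputs; otherwise report none.
    */
  def outputPartitioning(
      partitionKeys: Seq[Attr],
      output: Seq[Attr]): Option[KeyGrouped] = {
    val outputIds = output.map(_.exprId).toSet
    val allKeysInOutput =
      partitionKeys.nonEmpty && partitionKeys.forall(k => outputIds.contains(k.exprId))
    if (allKeysInOutput) Some(KeyGrouped(partitionKeys)) else None
  }
}
```

Under these assumptions, calling `PartitioningSketch.outputPartitioning` with `id` as the only key and an output of just `index` and `_partition` yields `None`, while an output that includes `id` yields `Some(KeyGrouped(...))`.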

          People

            Assignee: Huaxin Gao (huaxingao)
            Reporter: Huaxin Gao (huaxingao)