[SPARK-22267] Spark SQL incorrectly reads ORC file when column order is different - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.6.3, 2.0.2, 2.1.0, 2.2.0
Fix Version/s: 2.3.0
Component/s: SQL
Labels: None

Description

For a long time, Apache Spark SQL has returned incorrect results when the column order in an ORC file's schema differs from the column order in the metastore schema.

scala> Seq(1 -> 2).toDF("c1", "c2").write.format("parquet").mode("overwrite").save("/tmp/p")
scala> Seq(1 -> 2).toDF("c1", "c2").write.format("orc").mode("overwrite").save("/tmp/o")
scala> sql("CREATE EXTERNAL TABLE p(c2 INT, c1 INT) STORED AS parquet LOCATION '/tmp/p'")
scala> sql("CREATE EXTERNAL TABLE o(c2 INT, c1 INT) STORED AS orc LOCATION '/tmp/o'")
scala> spark.table("p").show  // Parquet is good.
+---+---+
| c2| c1|
+---+---+
|  2|  1|
+---+---+
scala> spark.table("o").show    // This is wrong.
+---+---+
| c2| c1|
+---+---+
|  1|  2|
+---+---+
scala> spark.read.orc("/tmp/o").show  // This is correct.
+---+---+
| c1| c2|
+---+---+
|  1|  2|
+---+---+
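
The wrong result above is consistent with file columns being matched to table columns by position instead of by name. The following is a minimal plain-Scala sketch of that difference, not Spark's actual ORC reader code: `OrcLikeFile`, `readByPosition`, and `readByName` are hypothetical names introduced only for illustration.

```scala
// Hypothetical model for illustration only; this is NOT Spark's ORC reader.
object ColumnResolutionSketch {
  // A "file" is its own physical schema plus rows stored in that order.
  final case class OrcLikeFile(schema: Seq[String], rows: Seq[Seq[Int]])

  // Positional resolution: the i-th table column takes the i-th file column,
  // regardless of what either column is called.
  def readByPosition(file: OrcLikeFile, tableSchema: Seq[String]): Seq[Seq[Int]] =
    file.rows.map(row => tableSchema.indices.map(i => row(i)))

  // Name-based resolution: each table column is looked up by name in the
  // file schema, so a different column order does not matter.
  def readByName(file: OrcLikeFile, tableSchema: Seq[String]): Seq[Seq[Int]] = {
    val indexOf = file.schema.zipWithIndex.toMap
    file.rows.map(row => tableSchema.map(name => row(indexOf(name))))
  }

  def main(args: Array[String]): Unit = {
    // Same data as the reproduction: file written as (c1 = 1, c2 = 2),
    // table declared as (c2, c1).
    val file = OrcLikeFile(Seq("c1", "c2"), Seq(Seq(1, 2)))
    val tableSchema = Seq("c2", "c1")

    // Positional matching keeps the file order under the table's headers.
    assert(readByPosition(file, tableSchema) == Seq(Seq(1, 2)))
    // Name-based matching yields c2 = 2, c1 = 1.
    assert(readByName(file, tableSchema) == Seq(Seq(2, 1)))
  }
}
```

In this model, positional matching reproduces the values shown for table `o`, while name-based matching reproduces the values shown for table `p`.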

TESTCASE

  test("SPARK-22267 Spark SQL incorrectly reads ORC files when column order is different") {
    withTempDir { dir =>
      val path = dir.getCanonicalPath

      Seq(1 -> 2).toDF("c1", "c2").write.format("orc").mode("overwrite").save(path)
      checkAnswer(spark.read.orc(path), Row(1, 2))

      Seq("true", "false").foreach { value =>
        withTable("t") {
          withSQLConf(HiveUtils.CONVERT_METASTORE_ORC.key -> value) {
            sql(s"CREATE EXTERNAL TABLE t(c2 INT, c1 INT) STORED AS ORC LOCATION '$path'")
            checkAnswer(spark.table("t"), Row(2, 1))
          }
        }
      }
    }
  }

Issue Links

blocks

SPARK-20901 Feature parity for ORC with Parquet

Open

links to

[Github] Pull Request #19744 (mpetruska)

[Github] Pull Request #19928 (dongjoon-hyun)

People

Assignee: Dongjoon Hyun

Reporter: Dongjoon Hyun

Votes: 0

Watchers: 6

Dates

Created: 12/Oct/17 18:45

Updated: 11/Dec/17 13:55

Resolved: 11/Dec/17 13:55