[SPARK-33593] Vector reader got incorrect data with binary partition value - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.7, 3.0.1, 3.1.0, 3.2.0
Fix Version/s: 2.4.8, 3.0.2, 3.1.0
Component/s: SQL
Labels: correctness
Target Version/s: 2.4.8, 3.0.2, 3.1.0

Description

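test("Parquet vector reader incorrect with binary partition value") {
  // Run the same query with the vectorized Parquet reader disabled, then enabled.
  Seq(false, true).foreach(tag => {
    withSQLConf("spark.sql.parquet.enableVectorizedReader" -> tag.toString) {
      withTable("t1") {
        sql(
          """CREATE TABLE t1(name STRING, id BINARY, part BINARY)
            | USING PARQUET PARTITIONED BY (part)""".stripMargin)
        // X'537061726B2053514C' is the hex encoding of "Spark SQL",
        // so the id column and the part partition value hold the same bytes.
        sql(s"INSERT INTO t1 PARTITION(part = 'Spark SQL') VALUES('a', X'537061726B2053514C')")
        if (tag) {
          // Vectorized reader: part comes back empty (the incorrect result reported here).
          checkAnswer(sql("SELECT name, cast(id as string), cast(part as string) FROM t1"),
            Row("a", "Spark SQL", ""))
        } else {
          // Non-vectorized reader: part is read back correctly.
          checkAnswer(sql("SELECT name, cast(id as string), cast(part as string) FROM t1"),
            Row("a", "Spark SQL", "Spark SQL"))
        }
      }
    }
  })
}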
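
The reproduction relies on the id column and the part partition value carrying the same bytes, so both casts should yield "Spark SQL". The following is a minimal standalone sketch (not part of the Spark test suite) that checks this outside Spark. The object name HexLiteralCheck and the helper decodeHex are hypothetical, and it assumes the string partition value is encoded as UTF-8 when stored as BINARY.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper object, for illustration only; not Spark code.
object HexLiteralCheck {
  // Decode the hex digits of a binary literal such as X'53...' into bytes.
  def decodeHex(hex: String): Array[Byte] = {
    require(hex.length % 2 == 0, "hex string must have an even number of digits")
    hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
  }

  def main(args: Array[String]): Unit = {
    // Bytes inserted into the id column in the reproduction.
    val idBytes = decodeHex("537061726B2053514C")
    // Assumption: the partition value 'Spark SQL' is stored as its UTF-8 bytes.
    val partBytes = "Spark SQL".getBytes(StandardCharsets.UTF_8)

    println(new String(idBytes, StandardCharsets.UTF_8))  // prints "Spark SQL"
    println(java.util.Arrays.equals(idBytes, partBytes))  // prints "true"
  }
}
```

Since the two byte arrays match, any difference between cast(id as string) and cast(part as string) in the query result comes from how the reader fills in the partition column, not from the inserted data.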

Issue Links

links to

[Github] Pull Request #30824 (AngersZhuuuu)

[Github] Pull Request #30839 (AngersZhuuuu)

[Github] Pull Request #30840 (AngersZhuuuu)

People

Assignee: angerszhu

Reporter: angerszhu

Votes: 0

Watchers: 3

Dates

Created: 30/Nov/20 03:06

Updated: 21/Dec/20 11:28

Resolved: 18/Dec/20 08:03