Description
Partitioning discovery fails in the following test case:
    test("_SUCCESS should not break partitioning discovery") {
      withTempPath { dir =>
        val tablePath = new File(dir, "table")
        val df = (1 to 3).map(i => (i, i, i, i)).toDF("a", "b", "c", "d")

        df.write
          .format("parquet")
          .partitionBy("b", "c", "d")
          .save(tablePath.getCanonicalPath)

        Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1", "_SUCCESS"))
        Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1", "_SUCCESS"))
        Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1/d=1", "_SUCCESS"))

        checkAnswer(sqlContext.read.format("parquet").load(tablePath.getCanonicalPath), df)
      }
    }
Because _SUCCESS files are present in the inner partitioning directories, partitioning discovery fails.
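One way to picture the problem: if every directory that contains any file is treated as a leaf partition directory, the _SUCCESS markers make `b=1` and `b=1/c=1` look like leaves alongside `b=1/c=1/d=1`, so the discovered directories disagree on how many partition columns exist. The sketch below illustrates that idea with plain strings. `PartitionDirSketch`, `isDataFile`, and `leafPartitionDirs` are hypothetical names, and the rule "names starting with `_` or `.` are not data files" is an assumption made for illustration, not necessarily how Spark resolved this issue.

```scala
// Hypothetical illustration only; this is not Spark's implementation.
object PartitionDirSketch {

  // Assumption: file names starting with "_" or "." are markers or hidden
  // files (for example _SUCCESS) and should not count as data files.
  def isDataFile(path: String): Boolean = {
    val name = path.substring(path.lastIndexOf('/') + 1)
    !(name.startsWith("_") || name.startsWith("."))
  }

  // Parent directories of the data files. Only these directories would have
  // their key=value segments parsed as partition columns.
  // Assumes every path is absolute, so it contains at least one '/'.
  def leafPartitionDirs(files: Seq[String]): Set[String] =
    files
      .filter(isDataFile)
      .map(f => f.substring(0, f.lastIndexOf('/')))
      .toSet
}
```

With the layout from the test case, ignoring the marker files leaves a single leaf directory per partition, whereas keeping them would yield three directories of different depths for `b=1` alone.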
Issue Links
- relates to
  - SPARK-15895 _common_metadata and _metadata appearing in the inner partitioning dirs of a partitioned parquet datasets break partitioning discovery (Resolved)