Description
Parquet format PR #17 standardized the representation of Parquet complex types and listed backwards-compatibility rules. Spark SQL should implement these compatibility rules to improve interoperability.
Previously, Spark SQL was compatible only with parquet-avro, parquet-hive, and Impala, and that compatibility was achieved in an error-prone, ad hoc way, because the Parquet format spec did not explicitly specify complex type structures when Spark SQL's Parquet support was first written. Once this issue is fixed, Spark SQL is expected to be compatible with most (if not all) systems that generate Parquet data systematically, by conforming to the Parquet format spec and implementing all of its backwards-compatibility rules.
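To make the LIST rules concrete, here is a minimal sketch of how a reader might decide which field describes the elements of a LIST-annotated group. It is written in Python for brevity and is not Spark's implementation. The `Field` class and the `list_element` function are hypothetical names introduced only for this illustration, and the sample field names (`list`/`element`, `array`, `bag`/`array_element`) reflect the layouts commonly attributed to the standard spec, parquet-avro, and parquet-hive; treat them as assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical, minimal stand-in for a Parquet schema node."""
    name: str
    repetition: str  # "required", "optional", or "repeated"
    children: Optional[List["Field"]] = None  # None means primitive, a list means group

    @property
    def is_group(self) -> bool:
        return self.children is not None


def list_element(list_group: Field) -> Field:
    """Return the field describing the elements of a LIST-annotated group."""
    if not list_group.is_group or len(list_group.children) != 1:
        raise ValueError("a LIST group must contain exactly one field")
    repeated = list_group.children[0]
    if repeated.repetition != "repeated":
        raise ValueError("the single field of a LIST group must be repeated")

    # Rule 1: a repeated primitive is itself the (required) element.
    if not repeated.is_group:
        return repeated
    if not repeated.children:
        raise ValueError("a group must contain at least one field")
    # Rule 2: a repeated group with several fields is itself the element.
    if len(repeated.children) > 1:
        return repeated
    # Rule 3: a single-field repeated group named "array" or "<list name>_tuple"
    # is itself the element (legacy two-level layouts).
    if repeated.name in ("array", list_group.name + "_tuple"):
        return repeated
    # Rule 4: otherwise this is the three-level layout, and the single
    # child of the repeated group is the element.
    return repeated.children[0]


# Standard three-level layout.
standard = Field("my_list", "optional",
                 [Field("list", "repeated", [Field("element", "optional")])])
# Legacy two-level layout with a repeated primitive (assumed parquet-avro style).
two_level = Field("my_list", "optional", [Field("array", "repeated")])
# Three-level layout with non-standard names (assumed parquet-hive style).
hive_style = Field("my_list", "optional",
                   [Field("bag", "repeated", [Field("array_element", "optional")])])

print(list_element(standard).name)    # element
print(list_element(two_level).name)   # array
print(list_element(hive_style).name)  # array_element
```

The rules are checked in order, so a legacy two-level layout is recognized before the reader falls back to treating the structure as the standard three-level form. MAP types have their own, similar compatibility rules that this sketch does not cover.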
Issue Links
- is duplicated by
  - SPARK-5508 Arrays and Maps stored with Hive Parquet Serde may not be able to read by the Parquet support in the Data Source API (Resolved)
  - SPARK-8811 Read array struct data from parquet error (Resolved)