Description
This issue aims to upgrade the Apache ORC library from 1.4.4 to 1.5.1 to bring the following benefits to Apache Spark:
- ORC-91: Support for variable-length blocks in HDFS (the space currently wasted on padding in ORC is known to be 5%)
- ORC-344: Support for using Decimal64ColumnVector
In addition, Apache Hive 3.1.0 and 3.2.0 will use ORC 1.5.1 (HIVE-19669) and 1.5.2 (HIVE-19792), respectively, so this upgrade will improve compatibility between Apache Spark and Apache Hive.
Issue Links
- blocks SPARK-20901 Feature parity for ORC with Parquet (Open)