Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Version: 3.1.0
- Fix Version/s: None
- Component/s: None
Description
This umbrella issue tracks feature parity between the ORC and Parquet data sources.
Issue Links
- is blocked by
  - SPARK-15705 Spark won't read ORC schema from metastore for partitioned tables (Resolved)
  - SPARK-22258 Writing empty dataset fails with ORC format (Resolved)
  - SPARK-25306 Avoid skewed filter trees to speed up `createFilter` in ORC (Resolved)
  - SPARK-14387 Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc (Resolved)
  - SPARK-15347 Problem select empty ORC table (Resolved)
  - SPARK-15474 ORC data source fails to write and read back empty dataframe (Resolved)
  - SPARK-15731 orc writer directory permissions (Resolved)
  - SPARK-15757 Error occurs when using Spark sql "select" statement on orc file after hive sql "insert overwrite tb1 select * from sourcTb" has been executed on this orc file (Resolved)
  - SPARK-16628 OrcConversions should not convert an ORC table represented by MetastoreRelation to HadoopFsRelation if metastore schema does not match schema stored in ORC files (Resolved)
  - SPARK-17047 Spark 2 cannot create table when CLUSTERED. (Resolved)
  - SPARK-18355 Spark SQL fails to read data from a ORC hive table that has a new column added to it (Resolved)
  - SPARK-19109 ORC metadata section can sometimes exceed protobuf message size limit (Resolved)
  - SPARK-19430 Cannot read external tables with VARCHAR columns if they're backed by ORC files written by Hive 1.2.1 (Resolved)
  - SPARK-19809 NullPointerException on zero-size ORC file (Resolved)
  - SPARK-20515 Issue with reading Hive ORC tables having char/varchar columns in Spark SQL (Resolved)
  - SPARK-21422 Depend on Apache ORC 1.4.0 (Resolved)
  - SPARK-21686 spark.sql.hive.convertMetastoreOrc is causing NullPointerException while reading ORC tables (Resolved)
  - SPARK-21762 FileFormatWriter/BasicWriteTaskStatsTracker metrics collection fails if a new file isn't yet visible (Resolved)
  - SPARK-21912 ORC/Parquet table should not create invalid column names (Resolved)
  - SPARK-21929 Support `ALTER TABLE table_name ADD COLUMNS(..)` for ORC data source (Resolved)
  - SPARK-22158 convertMetastore should not ignore storage properties (Resolved)
  - SPARK-22267 Spark SQL incorrectly reads ORC file when column order is different (Resolved)
  - SPARK-22279 Turn on spark.sql.hive.convertMetastoreOrc by default (Resolved)
  - SPARK-22300 Update ORC to 1.4.1 (Resolved)
  - SPARK-22712 Use `buildReaderWithPartitionValues` in native OrcFileFormat (Resolved)
  - SPARK-23007 Add schema evolution test suite for file-based data sources (Resolved)
  - SPARK-23049 `spark.sql.files.ignoreCorruptFiles` should work for ORC files (Resolved)
  - SPARK-23340 Upgrade Apache ORC to 1.4.3 (Resolved)
  - SPARK-23355 convertMetastore should not ignore table properties (Resolved)
  - SPARK-23399 Register a task completion listener first for OrcColumnarBatchReader (Resolved)
  - SPARK-24322 Upgrade Apache ORC to 1.4.4 (Resolved)
  - SPARK-24472 Orc RecordReaderFactory throws IndexOutOfBoundsException (Resolved)
  - SPARK-25175 Field resolution should fail if there's ambiguity for ORC native reader (Resolved)
  - SPARK-25427 Add BloomFilter creation test cases (Resolved)
  - SPARK-25438 Fix FilterPushdownBenchmark to use the same memory assumption (Resolved)
  - SPARK-26427 Upgrade Apache ORC to 1.5.4 (Resolved)
  - SPARK-26437 Decimal data becomes bigint to query, unable to query (Resolved)
  - SPARK-14286 Empty ORC table join throws exception (Resolved)
  - SPARK-22280 Improve StatisticsSuite to test `convertMetastore` properly (Resolved)
  - SPARK-22320 ORC should support VectorUDT/MatrixUDT (Resolved)
  - SPARK-25145 Buffer size too small on spark.sql query with filterPushdown predicate=True (Resolved)
  - SPARK-21791 ORC should support column names with dot (Closed)
  - SPARK-11412 Support merge schema for ORC (Resolved)
  - SPARK-16060 Vectorized ORC reader (Resolved)
  - SPARK-22781 Support creating streaming dataset with ORC files (Resolved)
  - SPARK-18540 Wholestage code-gen for ORC Hive tables (Resolved)
  - SPARK-20682 Add new ORCFileFormat based on Apache ORC (Resolved)
  - SPARK-20728 Make ORCFileFormat configurable between sql/hive and sql/core (Resolved)
  - SPARK-21787 Support for pushing down filters for DateType in native OrcFileFormat (Resolved)
  - SPARK-21839 Support SQL config for ORC compression (Resolved)
  - SPARK-23456 Turn on `native` ORC implementation by default (Resolved)
  - SPARK-24576 Upgrade Apache ORC to 1.5.2 (Resolved)
  - SPARK-34562 Leverage parquet bloom filters (Resolved)
  - SPARK-12417 Orc bloom filter options are not propagated during file write in spark (Resolved)
  - SPARK-21783 Turn on ORC filter push-down by default (Resolved)
  - SPARK-23276 Enable UDT tests in (Hive)OrcHadoopFsRelationSuite (Resolved)
  - SPARK-23305 Test `spark.sql.files.ignoreMissingFiles` for all file-based data sources (Resolved)
  - SPARK-23452 Extend test coverage to all ORC readers (Resolved)
  - SPARK-24112 Add `spark.sql.hive.convertMetastoreTableProperty` for backward compatiblility (Closed)
  - SPARK-23072 Add a Unicode schema test for file-based data sources (Resolved)
  - SPARK-23342 Add ORC configuration tests for ORC data source (Resolved)
  - SPARK-23426 Use `hive` ORC impl and disable PPD for Spark 2.3.0 (Resolved)
  - SPARK-22672 Refactor ORC Tests (Resolved)
  - SPARK-22416 Move OrcOptions from `sql/hive` to `sql/core` (Resolved)
  - SPARK-23313 Add a migration guide for ORC (Resolved)
- is related to
  - SPARK-19459 ORC tables cannot be read when they contain char/varchar columns (Resolved)
  - ORC-233 Allow `orc.include.columns` to be empty (Closed)
- relates to
  - HIVE-14007 Replace ORC module with ORC release (Resolved)