Description
The version of Hive used in Spark 1.5.2 supports ORC bloom filters. However, when an ORC file is written with the bloom filter option set, Spark does not make use of the option.
For example, the following ORC write does not create a bloom filter even though the option is specified:
Map<String, String> orcOption = new HashMap<String, String>();
orcOption.put("orc.bloom.filter.columns", "*");
hiveContext.sql("select * from accounts where effective_date='2015-12-30'")
    .write().format("orc").options(orcOption).save("/tmp/accounts");
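For anyone reproducing this, a minimal self-contained sketch of how the writer options might be assembled is given below. It is an illustration only, not a fix: per this report, Spark 1.5.2 ignores these options. The class name `OrcBloomFilterOptions` and the helper `bloomFilterOptions` are hypothetical. `orc.bloom.filter.fpp` is the ORC property for the bloom filter false positive probability and is not part of the original snippet. The explicit column name `effective_date` is borrowed from the query above as an assumption in place of `"*"`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OrcBloomFilterOptions {

    // Hypothetical helper: builds the ORC writer options for bloom filters.
    // "columns" is a comma-separated list of column names; "fpp" is the
    // desired false positive probability, which must lie strictly in (0, 1).
    static Map<String, String> bloomFilterOptions(String columns, double fpp) {
        if (columns == null || columns.isEmpty()) {
            throw new IllegalArgumentException("columns must not be empty");
        }
        if (fpp <= 0.0 || fpp >= 1.0) {
            throw new IllegalArgumentException("fpp must be in (0, 1)");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("orc.bloom.filter.columns", columns);
        options.put("orc.bloom.filter.fpp", Double.toString(fpp));
        return options;
    }

    public static void main(String[] args) {
        // Assumption: effective_date is the column worth filtering on.
        Map<String, String> orcOption = bloomFilterOptions("effective_date", 0.05);
        System.out.println(orcOption);

        // With a HiveContext in scope, the map would be passed exactly as in
        // the snippet from this report:
        // hiveContext.sql("select * from accounts where effective_date='2015-12-30'")
        //     .write().format("orc").options(orcOption).save("/tmp/accounts");
    }
}
```

Whether the written file actually contains a bloom filter still has to be checked on the output itself, since the options map alone says nothing about what the writer did with it.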
Attachments
Issue Links
- blocks
  - SPARK-20901 Feature parity for ORC with Parquet (Open)
- is related to
  - SPARK-25427 Add BloomFilter creation test cases (Resolved)
- relates to
  - ORC-137 Disable bloomfilter PPD for timestamps for files created before ORC-135 (Closed)
  - ORC-101 Correct the use of the default charset in the bloomfilter (Closed)
- links to
User 'rajeshbalamohan' has created a pull request for this issue:
https://github.com/apache/spark/pull/10375