Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Parquet-mr 1.12.0 will add support for reading and writing Bloom filters.
Reading needs no action on the Hive side: the filter is applied automatically when a query has an equality (Eq) predicate on a column and the file contains a Bloom filter for that column.
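To illustrate why an Eq predicate can benefit from a Bloom filter, here is a minimal sketch of the skip decision a reader could make per row group. It uses a simplified classic Bloom filter built on `hashlib`, not the split-block format that Parquet actually stores, and the names `SimpleBloomFilter`, `can_skip_row_group`, and the sample values are hypothetical.

```python
import hashlib


class SimpleBloomFilter:
    """Classic k-hash Bloom filter; a simplified stand-in, not Parquet's on-disk format."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Derive k bit positions by salting the value with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means "definitely absent"; True only means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(value)
        )


def can_skip_row_group(bloom_filter, eq_value):
    """Return True when an equality predicate cannot match any row in the group."""
    if bloom_filter is None:
        # No filter was written for this column, so the group must be read.
        return False
    return not bloom_filter.might_contain(eq_value)


# Build the filter a writer might have produced for one row group.
row_group_filter = SimpleBloomFilter()
for customer_id in ["c-001", "c-002", "c-003"]:
    row_group_filter.add(customer_id)
```

A value that was written is never skipped, because a Bloom filter has no false negatives. A value that was not written is usually skipped, with a small false-positive rate that depends on the filter size.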
Writing needs some configuration, as Parquet-mr doesn't write bloom filters by default.
Table properties similar to ORC's, e.g. 'orc.bloom.filter.columns', could be used to choose the columns for which Parquet-mr writes Bloom filters. Hive and Impala could share the same table property for this purpose.
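As a rough sketch of the proposal, the snippet below turns a comma-separated column list from a table property into per-column writer settings. The property name `parquet.bloom.filter.columns` is hypothetical (modeled on `orc.bloom.filter.columns`), and the `parquet.bloom.filter.enabled#<column>` key form is an assumption about how Parquet-mr's per-column configuration is spelled; check it against the released version before relying on it.

```python
def bloom_filter_settings(table_properties, property_name="parquet.bloom.filter.columns"):
    """Translate a comma-separated column list into per-column writer settings."""
    raw = table_properties.get(property_name, "")
    # Tolerate whitespace and empty entries such as "a, ,b".
    columns = [c.strip() for c in raw.split(",") if c.strip()]
    # Assumed key form: "parquet.bloom.filter.enabled#<column path>".
    return {f"parquet.bloom.filter.enabled#{col}": "true" for col in columns}


# Hypothetical table properties, as they might arrive from the metastore.
props = {"parquet.bloom.filter.columns": "customer_id, order_id"}
settings = bloom_filter_settings(props)
```

When the property is absent the function returns an empty mapping, which matches the behavior described above: no Bloom filters are written unless they are explicitly requested.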
Issue Links
- is related to
  - SPARK-34562 Leverage parquet bloom filters (Resolved)
- relates to
  - HIVE-9260 Implement the bloom filter for the ParquetSerde (Open)
  - IMPALA-9470 Use Parquet bloom filters (Resolved)
  - PARQUET-41 Add bloom filters to parquet statistics (Resolved)