Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
The Parquet spec includes bloom filters, which can be useful when filtering. In the context of dataset::, this would be expressed as additional Parquet statistics expressions on each row group, so that row groups which cannot contain any matching rows are skipped more aggressively than min/max statistics alone allow.
Prerequisite: https://issues.apache.org/jira/browse/PARQUET-1327 (reader/writer support for bloom filters)
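To illustrate the idea, here is a minimal, self-contained sketch of bloom-filter-based row group skipping for an equality predicate. It is a toy model only: `ToyBloomFilter`, `build_row_group`, and `row_groups_to_scan` are hypothetical names, the filter layout is not the one defined by the Parquet spec, and nothing here uses the Arrow or Parquet APIs.

```python
import hashlib


class ToyBloomFilter:
    """Illustrative bloom filter; NOT the Parquet on-disk filter format."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bitset

    def _positions(self, value):
        # Derive num_hashes bit positions by salting one hash function.
        data = repr(value).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_row_group(rows, column):
    """Hypothetical row group: its rows plus a per-column bloom filter."""
    bloom = ToyBloomFilter()
    for row in rows:
        bloom.add(row[column])
    return {"rows": rows, "bloom": {column: bloom}}


def row_groups_to_scan(row_groups, column, wanted):
    """Indices of row groups that may contain rows with column == wanted."""
    return [
        i
        for i, group in enumerate(row_groups)
        if group["bloom"][column].might_contain(wanted)
    ]


groups = [
    build_row_group([{"id": i} for i in range(0, 100)], "id"),
    build_row_group([{"id": i} for i in range(100, 200)], "id"),
]
# Row group 1 holds id == 150, so it is always among the candidates.
# Row group 0 is skipped unless the filter reports a false positive.
candidates = row_groups_to_scan(groups, "id", 150)
```

A bloom filter never produces false negatives, so a skipped row group is guaranteed not to contain the value; false positives only cost an unnecessary read. This is most useful for equality predicates on columns whose min/max ranges overlap across row groups.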