Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
None
Description
We can build on the API that fastparquet defines for RowGroup filters (https://github.com/dask/fastparquet/blob/master/fastparquet/api.py#L296-L300) and translate those filters into the C++ enums we will define in https://issues.apache.org/jira/browse/PARQUET-1158. This should let us give users a simple predicate pushdown API that we can later extend behind the scenes from RowGroup level to Page level.
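To make the idea concrete, here is a minimal pure-Python sketch of RowGroup-level filtering, assuming filters are written in the fastparquet style as a list of `(column, op, value)` tuples that are ANDed together. The names `may_match`, `Filter`, and `Stats` are hypothetical and exist only for this illustration; they are not part of fastparquet, pyarrow, or the C++ API proposed in PARQUET-1158. The per-RowGroup min/max statistics are modelled as plain dicts rather than read from a real file.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical types for this sketch only.
Filter = Tuple[str, str, Any]          # (column, op, value)
Stats = Dict[str, Tuple[Any, Any]]     # column -> (min, max) for one RowGroup


def may_match(stats: Stats, filters: List[Filter]) -> bool:
    """Return False only if the min/max statistics prove that no row
    in the RowGroup can satisfy every filter (filters are ANDed)."""
    for column, op, value in filters:
        if column not in stats:
            continue  # no statistics for this column: cannot rule anything out
        lo, hi = stats[column]
        if op == "==":
            possible = lo <= value <= hi
        elif op == "!=":
            possible = not (lo == hi == value)
        elif op == ">":
            possible = hi > value
        elif op == ">=":
            possible = hi >= value
        elif op == "<":
            possible = lo < value
        elif op == "<=":
            possible = lo <= value
        elif op == "in":
            possible = any(lo <= v <= hi for v in value)
        else:
            raise ValueError(f"unsupported operator: {op!r}")
        if not possible:
            return False
    return True


# Three RowGroups with made-up statistics for column "x".
row_groups = [{"x": (0, 9)}, {"x": (10, 19)}, {"x": (20, 29)}]
filters = [("x", ">", 12), ("x", "<=", 20)]

# Indices of RowGroups that still have to be read.
kept = [i for i, rg in enumerate(row_groups) if may_match(rg, filters)]
```

Statistics-based pruning of this kind is conservative: a kept RowGroup may still contain rows that fail the predicate, so rows must be filtered again after reading. The same tuple format could later drive Page-level pruning without changing what the user writes.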
Issue Links
- depends upon: ARROW-8039 [Python][Dataset] Support using dataset API in pyarrow.parquet with a minimal ParquetDataset shim (Resolved)
- is related to: PARQUET-1158 [C++] Basic RowGroup filtering (Open)
- links to