Details
Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Description
In the Arrow dataset API, filter pushdown can greatly improve file-reading performance. The Parquet file format already implements it: https://github.com/apache/arrow/blob/35b3567e73423420a99dbe6116f000e3c77d2a4c/cpp/src/arrow/dataset/file_parquet.cc#L465-L484.
The ORC file format does not support filter pushdown yet. It currently ignores the "filter" of ScanOptions.
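
To illustrate why this matters, here is a minimal sketch of the idea behind filter pushdown, written in plain Python rather than against the Arrow or ORC APIs. It assumes each stripe carries min/max statistics for a column, and that a predicate of the form "value > threshold" can be checked against those statistics before the stripe is read. The names Stripe, make_stripe and scan_greater_than are hypothetical and exist only for this example; they are not part of Arrow.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Stripe:
    """Hypothetical stand-in for an ORC stripe with min/max column statistics."""
    min_value: int
    max_value: int
    rows: List[int]


def make_stripe(rows: Iterable[int]) -> Stripe:
    """Build a stripe and record its min/max statistics."""
    values = list(rows)
    return Stripe(min(values), max(values), values)


def scan_greater_than(
    stripes: List[Stripe], threshold: int, pushdown: bool = True
) -> Tuple[List[int], int]:
    """Return (matching rows, number of stripes read) for `value > threshold`."""
    matched: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        # With pushdown, skip any stripe whose maximum cannot satisfy the predicate.
        if pushdown and stripe.max_value <= threshold:
            continue
        stripes_read += 1
        # The row-level filter is still applied to every stripe that is read.
        matched.extend(v for v in stripe.rows if v > threshold)
    return matched, stripes_read


stripes = [
    make_stripe(range(0, 100)),
    make_stripe(range(100, 200)),
    make_stripe(range(200, 300)),
]
with_pushdown = scan_greater_than(stripes, 249, pushdown=True)
without_pushdown = scan_greater_than(stripes, 249, pushdown=False)
```

Both calls return the same rows, but the pushdown variant reads one stripe instead of three. A reader that ignores the filter behaves like the pushdown=False case: it reads everything and leaves the filtering to a later step.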