Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Impala 2.5.0
Description
Per-row filters currently cannot evaluate non-slotref expressions, but the frontend generates them anyway. In some cases, the Parquet scanner does not correctly ignore those filters.
The best fix is to evaluate row filters once a row has been materialized. This concentrates filter processing per row (better cache locality) and ensures that a) all slots are available for filter expr evaluation and b) a materialized Tuple is available, so all exprs can be evaluated.
Thankfully, the changes required to do this are very simple: just move the filter logic from the column reader to HdfsParquetScanner::ReadRow().
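To illustrate the idea only, here is a minimal, self-contained sketch of evaluating filters after a row has been materialized. It is not Impala's code: `Row`, `RowFilter`, `EvalRowFilters`, and `ReadRows` are hypothetical stand-ins, slots are simplified to `int64_t`, and the filter probe is an arbitrary predicate rather than a real Bloom filter. The point it tries to show is that once the whole row exists, a filter expression can be any function of the row, not just a single slot reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a materialized tuple: one int64 value per slot.
struct Row {
  std::vector<int64_t> slots;
};

// Hypothetical row filter: an expression over the whole row plus a membership
// probe on the expression's value (standing in for e.g. a Bloom filter lookup).
struct RowFilter {
  std::function<int64_t(const Row&)> expr;   // slot ref or any computed expression
  std::function<bool(int64_t)> may_contain;  // true if the value might match
};

// Returns true if the materialized row passes every filter.
bool EvalRowFilters(const Row& row, const std::vector<RowFilter>& filters) {
  for (const RowFilter& f : filters) {
    if (!f.may_contain(f.expr(row))) return false;
  }
  return true;
}

// Materializes each row from column-major data first, then applies the filters.
// Assumes every column has the same number of values.
std::vector<Row> ReadRows(const std::vector<std::vector<int64_t>>& columns,
                          const std::vector<RowFilter>& filters) {
  std::vector<Row> out;
  if (columns.empty()) return out;
  const std::size_t num_rows = columns[0].size();
  for (std::size_t r = 0; r < num_rows; ++r) {
    Row row;
    for (const std::vector<int64_t>& col : columns) row.slots.push_back(col[r]);
    // All slots are available here, so non-slotref expressions can be evaluated.
    if (EvalRowFilters(row, filters)) out.push_back(std::move(row));
  }
  return out;
}
```

In this sketch a filter whose expression is, say, the sum of two slots works the same way as a plain slot reference, because evaluation happens per row rather than inside a single column reader.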