Description
Currently, we push filters down for the normal (non-vectorized) Parquet reader, which then also filters record-by-record.
It seems Spark-side codegen row-by-row filtering might generally be faster than Parquet's, since Parquet's record-level filtering incurs type boxing and virtual function calls that Spark's generated code avoids.
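As a rough illustration of the overhead in question (a hypothetical micro-sketch, not Parquet's actual code), compare a predicate evaluated per record through a boxed, virtual interface with the monomorphic primitive comparison that whole-stage codegen effectively emits:

{code:scala}
object BoxingOverheadSketch {
  // Stands in for a record-level predicate evaluated via a generic interface:
  // each call is virtual and the Int argument is boxed to java.lang.Integer.
  trait BoxedPredicate { def keep(value: Any): Boolean }

  def main(args: Array[String]): Unit = {
    val values = Array.tabulate(10000000)(identity)
    val pred: BoxedPredicate = new BoxedPredicate {
      override def keep(value: Any): Boolean = value.asInstanceOf[Int] > 5000000
    }

    var kept = 0
    var start = System.nanoTime()
    values.foreach(v => if (pred.keep(v)) kept += 1)   // box + virtual call per element
    println(s"boxed/virtual: kept=$kept in ${(System.nanoTime() - start) / 1e6} ms")

    kept = 0
    start = System.nanoTime()
    var i = 0
    while (i < values.length) {                        // primitive comparison, inlinable
      if (values(i) > 5000000) kept += 1
      i += 1
    }
    println(s"primitive:     kept=$kept in ${(System.nanoTime() - start) / 1e6} ms")
  }
}
{code}

A crude System.nanoTime comparison like this is only indicative, since the JIT can partially optimize both paths; a proper measurement would go through JMH or Spark's benchmark harness.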
Maybe we should run a benchmark and, if it confirms this, disable Parquet's record-by-record filtering; a rough benchmark sketch follows below. This ticket came out of https://github.com/apache/spark/pull/14671; please refer to the discussion in that PR.
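A minimal sketch of such a benchmark, assuming a local SparkSession. spark.sql.parquet.filterPushdown and spark.sql.parquet.enableVectorizedReader are existing SQL configs; the path, data shape, and object name are made up. The data is deliberately generated so that row-group min/max statistics cannot prune anything, so the pushdown=true run mostly exercises Parquet's record-level filter:

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.rand

object ParquetRecordFilterBenchmark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ParquetRecordFilterBenchmark")
      // Disable the vectorized reader so the normal parquet-mr record reader is used.
      .config("spark.sql.parquet.enableVectorizedReader", "false")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical scratch path. Random low-cardinality values ensure every
    // row group spans the full value range, so min/max stats prune nothing
    // and any timing difference comes from record-by-record filtering.
    val path = "/tmp/parquet-filter-bench"
    spark.range(0L, 50000000L)
      .select((rand(42) * 1000).cast("int").as("id"))
      .write.mode("overwrite").parquet(path)

    def time(label: String)(f: => Unit): Unit = {
      val start = System.nanoTime()
      f
      println(s"$label: ${(System.nanoTime() - start) / 1e9} s")
    }

    for (pushdown <- Seq("true", "false")) {
      spark.conf.set("spark.sql.parquet.filterPushdown", pushdown)
      time(s"filterPushdown=$pushdown") {
        // Equality filters are convertible to Parquet filter predicates,
        // so they take the pushed-down path when pushdown is enabled.
        spark.read.parquet(path).filter($"id" === 7).count()
      }
    }

    spark.stop()
  }
}
{code}

Since Spark re-evaluates pushed filters on its side anyway, the pushdown=true run pays for Parquet's record-level filter on top of Spark's own, which is roughly the overhead this ticket suggests measuring.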