Details
Description
The method org.apache.spark.sql.execution.datasources.DataSourceStrategy#selectFilters, which is used to determine "pushdown-able" filters, neither preserves the order of the input Seq[Expression] nor returns the same order across identical plans (modulo ExprId differences). This results in CodeGenerator cache misses even when the exact same LogicalPlan is executed.
The method makes no attempt to maintain the order of the input predicates, though it happens to do so when the input contains fewer than 5 pushdown-able Expressions (due to the "small maps" logic in scala.collection.TraversableOnce#toMap).
Returning the filters in the same order as the input will reduce churn on the CodeGenerator cache under prolonged workloads that execute many very similar queries.
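The ordering behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use Spark: Predicate, translate, pushableViaMap and pushableInInputOrder are hypothetical names introduced only for illustration, and the code is not the actual body of selectFilters. It contrasts collecting (predicate, filter) pairs through toMap, where the iteration order depends on the Map implementation, with keeping the pairs in a Seq, which returns them in input order.

```scala
object SelectFiltersOrderSketch {
  // Hypothetical stand-in for a Catalyst Expression; not a Spark class.
  final case class Predicate(sql: String)

  // Hypothetical translation step: Some(filter) when the predicate can be
  // pushed down, None otherwise.
  def translate(p: Predicate): Option[String] =
    if (p.sql.contains("udf(")) None else Some(s"Filter(${p.sql})")

  // Order-fragile variant: the pairs go through toMap, so the order of the
  // returned keys is whatever the resulting Map happens to iterate in.
  def pushableViaMap(preds: Seq[Predicate]): Seq[Predicate] =
    preds.flatMap(p => translate(p).map(f => p -> f)).toMap.keys.toSeq

  // Order-preserving variant: the pairs stay in a Seq, so the result
  // follows the order of the input predicates.
  def pushableInInputOrder(preds: Seq[Predicate]): Seq[Predicate] =
    preds.flatMap(p => translate(p).map(f => p -> f)).map(_._1)

  def main(args: Array[String]): Unit = {
    // Six pushdown-able predicates, i.e. past the small-map threshold.
    val preds = (1 to 6).map(i => Predicate(s"c$i > $i"))
    println(pushableViaMap(preds).map(_.sql))       // order not guaranteed
    println(pushableInInputOrder(preds).map(_.sql)) // same order as preds
  }
}
```

Running main with more than four pushdown-able predicates may print the first line in a different order from the input, while the second line always follows the input order.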