Description
Currently, non-deterministic filters can be pushed down to V2 file sources. This differs from V1, which prevents non-deterministic filters from being pushed down.
Main consequences:
- Applying a rand filter to a partition column throws an exception:
  - IllegalArgumentException: requirement failed: Nondeterministic expression org.apache.spark.sql.catalyst.expressions.Rand should be initialized before eval.
- A non-deterministic UDF that collects metrics via accumulators gets pushed down and reports the wrong metrics.
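
The V1 behaviour described above can be sketched as a split of the filter list by determinism. This is a simplified model in plain Python, not Spark code: the `Filter` class, its `deterministic` flag, the `split_for_pushdown` helper, and the example predicate strings are all hypothetical stand-ins for Catalyst expressions and the planner's logic.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Filter:
    """Hypothetical stand-in for a Catalyst predicate."""
    sql: str
    deterministic: bool


def split_for_pushdown(filters: List[Filter]) -> Tuple[List[Filter], List[Filter]]:
    """Return (pushable, post_scan).

    Only deterministic filters are offered to the source; non-deterministic
    ones stay above the scan so they are evaluated in the normal operator.
    """
    pushable = [f for f in filters if f.deterministic]
    post_scan = [f for f in filters if not f.deterministic]
    return pushable, post_scan


# Example predicates (assumed for illustration only).
filters = [
    Filter("part = 1", deterministic=True),
    Filter("rand() > 0.5", deterministic=False),
    Filter("metrics_udf(value)", deterministic=False),
]

pushed, kept = split_for_pushdown(filters)
print([f.sql for f in pushed])
print([f.sql for f in kept])
```

Under this model, the rand filter and the metrics UDF are never handed to the file source, so neither consequence listed above would arise from pushdown.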