Description
When whole-stage codegen and predicate codegen both fail, FilterExec falls back to using InterpretedPredicate. If the predicate's expression contains any non-deterministic expressions, the evaluation throws an error:
scala> val df = Seq((1)).toDF("a")
df: org.apache.spark.sql.DataFrame = [a: int]

scala> df.filter('a > 0).show // this works fine
2018-04-21 20:39:26 WARN FilterExec:66 - Codegen disabled for this expression: (value#1 > 0)
+---+
|  a|
+---+
|  1|
+---+

scala> df.filter('a > rand(7)).show // this will throw an error
2018-04-21 20:39:40 WARN FilterExec:66 - Codegen disabled for this expression: (cast(value#1 as double) > rand(7))
2018-04-21 20:39:40 ERROR Executor:91 - Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.IllegalArgumentException: requirement failed: Nondeterministic expression org.apache.spark.sql.catalyst.expressions.Rand should be initialized before eval.
	at scala.Predef$.require(Predef.scala:224)
	at org.apache.spark.sql.catalyst.expressions.Nondeterministic$class.eval(Expression.scala:326)
	at org.apache.spark.sql.catalyst.expressions.RDG.eval(randomExpressions.scala:34)
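This happens because nothing on the interpreted fallback path initializes the Nondeterministic expressions in the predicate before eval is called on them, so the requirement check in Nondeterministic.eval fails.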
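To make the failure mode concrete, here is a minimal, self-contained sketch of the initialize-before-eval contract. It does not use Spark's classes: Expr, NondeterministicExpr, InputValue, RandLike, GreaterThan and InterpretedPredicateSketch are all hypothetical stand-ins, and the initialize method on the predicate is only one possible way to address the problem, not necessarily how it is or will be fixed in Spark.

```scala
// Simplified model of the contract; none of these types are Spark's real classes.
trait Expr {
  def children: Seq[Expr]
  def eval(row: Int): Any

  // Visit this node and every descendant (pre-order).
  def foreach(f: Expr => Unit): Unit = {
    f(this)
    children.foreach(_.foreach(f))
  }
}

// Stand-in for a nondeterministic expression: eval refuses to run until
// initialize has been called.
trait NondeterministicExpr extends Expr {
  private var initialized = false

  final def initialize(partitionIndex: Int): Unit = {
    initializeInternal(partitionIndex)
    initialized = true
  }

  final override def eval(row: Int): Any = {
    require(initialized,
      s"Nondeterministic expression ${getClass.getName} should be initialized before eval.")
    evalInternal(row)
  }

  protected def initializeInternal(partitionIndex: Int): Unit
  protected def evalInternal(row: Int): Any
}

// Hypothetical leaf that returns the input row's value as a Double.
case object InputValue extends Expr {
  val children: Seq[Expr] = Nil
  def eval(row: Int): Any = row.toDouble
}

// Hypothetical rand-like expression; its generator only exists after initialize.
case class RandLike(seed: Long) extends NondeterministicExpr {
  val children: Seq[Expr] = Nil
  private var rng: scala.util.Random = null

  protected def initializeInternal(partitionIndex: Int): Unit =
    rng = new scala.util.Random(seed + partitionIndex)

  protected def evalInternal(row: Int): Any = rng.nextDouble()
}

case class GreaterThan(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
  def eval(row: Int): Any =
    left.eval(row).asInstanceOf[Double] > right.eval(row).asInstanceOf[Double]
}

// Hypothetical interpreted predicate. Its initialize walks the expression
// tree and initializes every nondeterministic node before any eval call.
class InterpretedPredicateSketch(expression: Expr) {
  def initialize(partitionIndex: Int): Unit = expression.foreach {
    case n: NondeterministicExpr => n.initialize(partitionIndex)
    case _ => // deterministic nodes need no setup
  }

  def eval(row: Int): Boolean = expression.eval(row).asInstanceOf[Boolean]
}

object Demo extends App {
  val predicate = new InterpretedPredicateSketch(GreaterThan(InputValue, RandLike(7)))

  // Without initialization, eval fails the same way as in the report.
  println(scala.util.Try(predicate.eval(1)).isFailure) // prints "true"

  // After initialization, eval succeeds (1.0 is greater than any value in [0, 1)).
  predicate.initialize(0)
  println(predicate.eval(1)) // prints "true"
}
```

In this model the caller is responsible for invoking initialize once per partition before evaluating any rows, which is the step that is missing on the fallback path described above.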
This is a low-impact issue: both whole-stage codegen and predicate codegen must fail before FilterExec falls back to InterpretedPredicate.