[SPARK-24043] InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 2.3.0
Fix Version/s: 2.4.0
Component/s: SQL
Labels: None

Description

When whole-stage codegen and predicate codegen both fail, FilterExec falls back to using InterpretedPredicate. If the predicate's expression tree contains any Nondeterministic expression, evaluation throws an error:

scala> val df = Seq((1)).toDF("a")
df: org.apache.spark.sql.DataFrame = [a: int]

scala> df.filter('a > 0).show // this works fine
2018-04-21 20:39:26 WARN  FilterExec:66 - Codegen disabled for this expression:
 (value#1 > 0)
+---+
|  a|
+---+
|  1|
+---+

scala> df.filter('a > rand(7)).show // this will throw an error
2018-04-21 20:39:40 WARN  FilterExec:66 - Codegen disabled for this expression:
 (cast(value#1 as double) > rand(7))
2018-04-21 20:39:40 ERROR Executor:91 - Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.IllegalArgumentException: requirement failed: Nondeterministic expression org.apache.spark.sql.catalyst.expressions.Rand should be initialized before eval.
	at scala.Predef$.require(Predef.scala:224)
	at org.apache.spark.sql.catalyst.expressions.Nondeterministic$class.eval(Expression.scala:326)
	at org.apache.spark.sql.catalyst.expressions.RDG.eval(randomExpressions.scala:34)

This is because no code initializes the Nondeterministic expressions before eval is called on them.
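
The initialize-before-eval contract behind this error can be illustrated with a small standalone model. This is only a sketch under stated assumptions: it does not use Spark, every name in it (Expr, NondeterministicExpr, RandExpr, InputValue, PredicateSketch) is hypothetical, and the initialize method on the predicate shows one possible way to close the gap, not necessarily what the linked pull request does.

```scala
import scala.util.Random

// Simplified stand-ins for Catalyst classes. Every name below is hypothetical.
sealed trait Expr {
  def children: Seq[Expr]
  def eval(input: Int): Any
  // Pre-order traversal of the expression tree.
  def foreach(f: Expr => Unit): Unit = {
    f(this)
    children.foreach(_.foreach(f))
  }
}

// Models the contract from the error message: eval requires a prior initialize.
sealed trait NondeterministicExpr extends Expr {
  private var initialized = false
  protected def initializeInternal(partitionIndex: Int): Unit
  protected def evalInternal(input: Int): Any

  final def initialize(partitionIndex: Int): Unit = {
    initializeInternal(partitionIndex)
    initialized = true
  }

  final def eval(input: Int): Any = {
    require(initialized,
      s"Nondeterministic expression ${getClass.getName} should be initialized before eval.")
    evalInternal(input)
  }
}

// The single input column, widened to Double like the cast in the log above.
case object InputValue extends Expr {
  val children: Seq[Expr] = Nil
  def eval(input: Int): Any = input.toDouble
}

// A seeded random value in [0, 1); the generator exists only after initialize.
final case class RandExpr(seed: Long) extends NondeterministicExpr {
  private var rng: Random = null
  val children: Seq[Expr] = Nil
  protected def initializeInternal(partitionIndex: Int): Unit =
    rng = new Random(seed + partitionIndex)
  protected def evalInternal(input: Int): Any = rng.nextDouble()
}

final case class GreaterThan(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
  def eval(input: Int): Any =
    left.eval(input).asInstanceOf[Double] > right.eval(input).asInstanceOf[Double]
}

// Interpreted predicate that walks the tree and initializes every
// nondeterministic node before any row is evaluated.
final class PredicateSketch(expr: Expr) {
  def initialize(partitionIndex: Int): Unit = expr.foreach {
    case n: NondeterministicExpr => n.initialize(partitionIndex)
    case _ => ()
  }
  def eval(input: Int): Boolean = expr.eval(input).asInstanceOf[Boolean]
}
```

In this model, calling eval on `new PredicateSketch(GreaterThan(InputValue, RandExpr(7L)))` without calling initialize first fails the require check with an IllegalArgumentException, which corresponds to the failure in the stack trace above. After `initialize(0)`, `eval(1)` returns true, since 1.0 is greater than any value in [0, 1).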

This is a low-impact issue: both whole-stage codegen and predicate codegen must fail before FilterExec falls back to using InterpretedPredicate.

Issue Links

links to

[Github] Pull Request #21144 (bersprockets)

People

Assignee: Bruce Robbins
Reporter: Bruce Robbins
Votes: 0
Watchers: 5

Dates

Created: 22/Apr/18 03:46
Updated: 07/May/18 15:54
Resolved: 07/May/18 15:54