Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version/s: 2.4.0
- Labels: None
Description
A few SQL-related tests fail in Scala 2.12, such as UDFSuite's "SPARK-24891 Fix HandleNullInputsForUDF rule":
```
- SPARK-24891 Fix HandleNullInputsForUDF rule *** FAILED ***
  Results do not match for query:
  ...
  == Results ==
  !== Correct Answer - 3 ==   == Spark Answer - 3 ==
  !struct<>                   struct<a:bigint,b:int,c:int>
  ![0,10,null]                [0,10,0]
  ![1,12,null]                [1,12,1]
  ![2,14,null]                [2,14,2]
  (QueryTest.scala:163)
```
You can get a rough idea of what's going on by reading the test:
test("SPARK-24891 Fix HandleNullInputsForUDF rule") { // assume(!ClosureCleanerSuite2.supportsLMFs) // This test won't test what it intends to in 2.12, as lambda metafactory closures // have arg types that are not primitive, but Object val udf1 = udf({(x: Int, y: Int) => x + y}) val df = spark.range(0, 3).toDF("a") .withColumn("b", udf1($"a", udf1($"a", lit(10)))) .withColumn("c", udf1($"a", lit(null))) val plan = spark.sessionState.executePlan(df.logicalPlan).analyzed comparePlans(df.logicalPlan, plan) checkAnswer( df, Seq( Row(0, 10, null), Row(1, 12, null), Row(2, 14, null))) }
It seems that the closure that is fed in as a UDF changes behavior in 2.12 where primitive-type arguments are concerned: an Int argument, when fed null, acts like 0.
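That "null acts like 0" behavior matches Scala's generic unboxing: scala.runtime.BoxesRunTime.unboxToInt(null) returns 0, so a closure invoked through its erased Object-typed signature with no null guard quietly turns null into 0. A standalone sketch, no Spark required:

```scala
object NullToZeroDemo {
  def main(args: Array[String]): Unit = {
    val add = (x: Int, y: Int) => x + y
    // Call the closure through its erased signature, the way a generic
    // evaluator would; unboxing routes through BoxesRunTime.unboxToInt.
    val erased = add.asInstanceOf[(AnyRef, AnyRef) => AnyRef]
    println(erased(Int.box(1), null)) // prints 1, i.e. 1 + 0
  }
}
```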
I'm sure it's down to a difference in the LMF (LambdaMetafactory) closure and how its argument types are understood, but I'm not exactly sure of the cause yet.
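One way to see the difference is to reflect on the closure's apply method, which is roughly what the analyzer relies on to decide whether a UDF's inputs are primitive. The `parameterTypes` helper below is illustrative, not Spark's actual code:

```scala
object LmfErasureDemo {
  // Recover a closure's declared parameter types via reflection
  // on its non-bridge apply method.
  def parameterTypes(f: AnyRef): Seq[Class[_]] =
    f.getClass.getMethods
      .filter(m => m.getName == "apply" && !m.isBridge)
      .head
      .getParameterTypes
      .toSeq

  def main(args: Array[String]): Unit = {
    val add = (x: Int, y: Int) => x + y
    // Scala 2.11 (anonymous class): typically prints "int, int"
    // Scala 2.12 (LambdaMetafactory): prints "Object, Object"
    println(parameterTypes(add).map(_.getSimpleName).mkString(", "))
  }
}
```

Under 2.12 the primitive types are no longer visible this way, so a rule keyed on them silently stops firing.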
Issue Links
- is cloned by: SPARK-25690 Analyzer rule "HandleNullInputsForUDF" does not stabilize and can be applied infinitely (Resolved)
- links to