Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Incomplete
-
2.3.1
-
None
-
None
-
spark on yarn 2.3.1
Description
My SQL query uses two tables to join. One table join key has null value. I use random value instead of null;however, it has error. The error as follows:
Error in query: nondeterministic expressions are only allowed in Project, Filter, Aggregate or Window, found scan spark source code is org.apache.spark.sql.catalyst.analysis.CheckAnalysis check sql, because the number of random variables is uncertain, it is prohibited
case o if o.expressions.exists(!_.deterministic) && !o.isInstanceOf[Project] && !o.isInstanceOf[Filter] && !o.isInstanceOf[Aggregate] && !o.isInstanceOf[Window] => // The rule above is used to check Aggregate operator. failAnalysis( s"""nondeterministic expressions are only allowed in |Project, Filter, Aggregate or Window, found:| |${o.expressions.map(_.sql).mkString(",")}| |in operator ${operator.simpleString} """.stripMargin)|
Is it possible to add Join to this code? It's not yet tested.And whether there will be other effects
case o if o.expressions.exists(!_.deterministic) && !o.isInstanceOf[Project] && !o.isInstanceOf[Filter] && !o.isInstanceOf[Aggregate] && !o.isInstanceOf[Window] +{color:#d04437}&& !o.isInstanceOf[Join]{color}+ => // The rule above is used to check Aggregate operator. failAnalysis( s"""nondeterministic expressions are only allowed in |Project, Filter, Aggregate or Window or Join, found:| |${o.expressions.map(_.sql).mkString(",")}| |in operator ${operator.simpleString} """.stripMargin)|
This is my SQL query is as below:
SELECT T1.CUST_NO AS CUST_NO, T3.CON_LAST_NAME AS CUST_NAME, T3.CON_SEX_MF AS SEX_CODE, T3.X_POSITION AS POST_LV_CODE FROM tmp.ICT_CUST_RANGE_INFO T1 LEFT JOIN tmp.F_CUST_BASE_INFO_ALL T3 ON CASE WHEN coalesce(T1.CUST_NO, '') ='' THEN concat('cust_no', RAND()) ELSE T1.CUST_NO END = T3.BECIF AND T3.DATE='20181105' WHERE T1.DATE='20181105'