Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version: 4.0.0
Description
SQL failures already provide helpful error context:

org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 1) ==
a / b
^^^^^
  at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:201)
  at org.apache.spark.sql.errors.QueryExecutionErrors.divideByZeroError(QueryExecutionErrors.scala)
  ...
We could add a similar user-friendly error context to Dataset APIs.
For example, consider the following Spark app, SimpleApp.scala:
 1 import org.apache.spark.sql.SparkSession
 2 import org.apache.spark.sql.functions._
 3
 4 object SimpleApp {
 5   def main(args: Array[String]) {
 6     val spark = SparkSession.builder.appName("Simple Application").config("spark.sql.ansi.enabled", true).getOrCreate()
 7     import spark.implicits._
 8
 9     val c = col("a") / col("b")
10
11     Seq((1, 0)).toDF("a", "b").select(c).show()
12
13     spark.stop()
14   }
15 }
then the error context could be:
Exception in thread "main" org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== Dataset ==
"div" was called from SimpleApp$.main(SimpleApp.scala:9)

  at org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:201)
  at org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:672)
  ...
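The proposed context requires capturing the user-code call site at the moment the expression is constructed (line 9), not where it later fails during evaluation. The linked withOrigin tickets point at this mechanism. Below is a minimal, self-contained sketch of the idea; all names here are hypothetical and this is not Spark's actual implementation (which lives around CurrentOrigin/withOrigin in Catalyst):

```java
// Sketch: record the caller's stack frame when an expression is built, and
// attach it to the error message if evaluation fails later.
// CallSiteSketch, callSite, and div are illustrative names, not Spark APIs.
public class CallSiteSketch {

    // Walk the current stack and return the first frame that belongs neither
    // to this "library" class nor to JDK internals -- i.e. the user's frame.
    static String callSite() {
        for (StackTraceElement f : Thread.currentThread().getStackTrace()) {
            String cls = f.getClassName();
            if (!cls.equals(CallSiteSketch.class.getName())
                    && !cls.startsWith("java.")
                    && !cls.startsWith("jdk.")
                    && !cls.startsWith("sun.")) {
                return cls + "." + f.getMethodName()
                        + "(" + f.getFileName() + ":" + f.getLineNumber() + ")";
            }
        }
        return "<unknown>";
    }

    // Stand-in for an operation like col("a") / col("b"): the origin was
    // captured at construction time and is reported only on failure.
    static String div(int a, int b, String origin) {
        if (b == 0) {
            return "[DIVIDE_BY_ZERO] \"div\" was called from " + origin;
        }
        return String.valueOf(a / b);
    }

    public static void main(String[] args) {
        // Capture the origin where the expression is built...
        String origin = callSite();
        // ...so the failure later points back to user code, not library code.
        System.out.println(div(1, 0, origin));
    }
}
```

The key design point, mirrored by the real feature, is that stack capture happens eagerly in the API call (cheap enough to gate behind a config, per SPARK-45826), because by the time the division fails at runtime the user frame is long gone.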
Issue Links
- is related to
  - SPARK-45805 Eliminate magic numbers in withOrigin (Resolved)
  - SPARK-45826 Add a SQL config for extra stack traces in Origin (Resolved)