Description
When performing arithmetic between doubles and decimals, the resulting value is always a double. This is surprising to me: when an exact type is present as one of the inputs, I would expect the inexact type to be lifted and the result presented exactly, rather than the exact type being lowered to the inexact one, producing a result that may contain rounding errors. The choice to use a decimal in the first place was presumably made because rounding errors were deemed an issue.
When performing arithmetic between decimals and integers, the expected behaviour is seen: the result is a decimal.
See the following example:
import org.apache.spark.sql.functions

val df = sparkSession.createDataFrame(Seq(Tuple1(0L))).toDF("a")

val decimalInt = df.select(functions.lit(BigDecimal(3.14)) + functions.lit(1) as "d")
val decimalDouble = df.select(functions.lit(BigDecimal(3.14)) + functions.lit(1.0) as "d")

decimalInt.schema.printTreeString()
decimalInt.show()

decimalDouble.schema.printTreeString()
decimalDouble.show()
which produces this output (with possible variation in the rounding error):
root
 |-- d: decimal(4,2) (nullable = true)

+----+
|   d|
+----+
|4.14|
+----+

root
 |-- d: double (nullable = false)

+-----------------+
|                d|
+-----------------+
|4.140000000000001|
+-----------------+
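The coercion can also be inspected in the query plan. Assuming the cast is inserted by the analyzer, the extended explain output for decimalDouble from the example above should show the decimal literal being cast to double before the addition:

// Sketch: print the parsed/analyzed/optimized/physical plans for the
// double case; the decimal literal is expected (assumption) to appear
// wrapped in a cast to double there.
decimalDouble.explain(true)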
I would argue that this is a bug, and that the correct behaviour would be to lift the result to a decimal when one operand is a double as well, just as is done for integers.
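As a possible workaround in the meantime, the double operand can be cast to a decimal explicitly before the addition, so the arithmetic is carried out in decimal. A minimal sketch, reusing sparkSession and df from the example above; the precision and scale in DecimalType(3, 2) are an assumption chosen to fit this particular value:

import org.apache.spark.sql.functions
import org.apache.spark.sql.types.DecimalType

// Workaround sketch: cast the double operand to a decimal explicitly so the
// addition happens in decimal arithmetic instead of double arithmetic.
val decimalCastDouble = df.select(
  (functions.lit(BigDecimal(3.14)) + functions.lit(1.0).cast(DecimalType(3, 2))) as "d")

decimalCastDouble.schema.printTreeString()  // should now report a decimal type
decimalCastDouble.show()                    // should print 4.14 without a rounding error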