Description
When performing arithmetic between doubles and decimals, the resulting value is always a double. This is surprising to me: when an exact type is present as one of the inputs, I would expect the inexact type to be lifted and the result presented exactly, rather than the exact type being lowered to the inexact one, producing a result that may contain rounding errors. The choice to use a decimal in the first place was presumably made because rounding errors were deemed an issue.
When performing arithmetic between decimals and integers, the expected behaviour is seen: the result is a decimal.
See the following example:
import org.apache.spark.sql.functions

val df = sparkSession.createDataFrame(Seq(Tuple1(0L))).toDF("a")

val decimalInt = df.select(functions.lit(BigDecimal(3.14)) + functions.lit(1) as "d")
val decimalDouble = df.select(functions.lit(BigDecimal(3.14)) + functions.lit(1.0) as "d")

decimalInt.schema.printTreeString()
decimalInt.show()

decimalDouble.schema.printTreeString()
decimalDouble.show()
which produces this output (with possible variation in the rounding error):
root
 |-- d: decimal(4,2) (nullable = true)

+----+
|   d|
+----+
|4.14|
+----+

root
 |-- d: double (nullable = false)

+-----------------+
|                d|
+-----------------+
|4.140000000000001|
+-----------------+
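The coercion can also be inspected in the query plan. Assuming the cast is inserted by the analyzer, the extended explain output for decimalDouble from the example above should show the decimal literal being cast to double before the addition:

// Sketch: print the parsed/analyzed/optimized/physical plans for the
// double case; the decimal literal is expected (assumption) to appear
// wrapped in a cast to double there.
decimalDouble.explain(true)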
I would argue that this is a bug, and that the correct behaviour would be to lift the result to a decimal when one operand is a double as well, just as is done for integers.
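As a possible workaround in the meantime, the double operand can be cast to a decimal explicitly before the addition, so the arithmetic is carried out in decimal. A minimal sketch, reusing sparkSession and df from the example above; the precision and scale in DecimalType(3, 2) are an assumption chosen to fit this particular value:

import org.apache.spark.sql.functions
import org.apache.spark.sql.types.DecimalType

// Workaround sketch: cast the double operand to a decimal explicitly so the
// addition happens in decimal arithmetic instead of double arithmetic.
val decimalCastDouble = df.select(
  (functions.lit(BigDecimal(3.14)) + functions.lit(1.0).cast(DecimalType(3, 2))) as "d")

decimalCastDouble.schema.printTreeString()  // should now report a decimal type
decimalCastDouble.show()                    // should print 4.14 without a rounding error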