Description
Please excuse me if this issue was addressed already - I was unable to find it.
Calling .describe().show() on my dataframe results in a value of null for the row "mean":
val foo = spark.read.parquet("decimalNumbers.parquet")
foo.select(col("numericvariable")).describe().show()

foo: org.apache.spark.sql.DataFrame = [numericvariable: decimal(38,32)]
+-------+--------------------+
|summary|     numericvariable|
+-------+--------------------+
|  count|                 299|
|   mean|                null|
| stddev|  0.2376438793946738|
|    min|0.037815489727642...|
|    max|2.138189366554511...|
+-------+--------------------+
But all of the values in this column seem OK (I can attach a Parquet file). When I round the column, however, the mean is computed correctly:
foo.select(bround(col("numericvariable"), 31)).describe().show()

+-------+---------------------------+
|summary|bround(numericvariable, 31)|
+-------+---------------------------+
|  count|                        299|
|   mean|       0.139522503183236...|
| stddev|         0.2376438793946738|
|    min|       0.037815489727642...|
|    max|       2.138189366554511...|
+-------+---------------------------+
Rounding to 32 decimal places, however, also gives null for the mean.
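To help narrow this down, here is a minimal sketch of two further checks. It assumes Spark is on the classpath and that decimalNumbers.parquet is in the working directory; the object name and the local master setting are only illustrative. The first check computes the mean with avg directly on the decimal column, to see whether the null comes from describe() itself or from the decimal average. The second runs describe() after casting to double, which gives up precision beyond roughly 15 to 17 significant digits, so it is a diagnostic rather than a fix. I have not included the output, since I am not asserting what either call returns.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col}

// Hypothetical standalone check; object name and master setting are assumptions.
object DescribeDecimalCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("describe-decimal-check")
      .master("local[*]")
      .getOrCreate()

    // Same file and column as in the report above.
    val foo = spark.read.parquet("decimalNumbers.parquet")

    // Check 1: mean computed directly on the decimal(38,32) column, bypassing describe().
    foo.select(avg(col("numericvariable"))).show(false)

    // Check 2: describe() after casting to double (lossy; for comparison only).
    foo.select(col("numericvariable").cast("double").as("numericvariable"))
      .describe()
      .show()

    spark.stop()
  }
}
```

In spark-shell the two select statements can be pasted on their own, since spark and foo already exist there.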