This bug was partially addressed in
SPARK-18555, but the root cause isn't completely solved. This bug is pretty critical since it changes the member id in Long in our application if the member id can not be represented by Double losslessly when the member id is very big.
Here is an example how this happens, with
the logical plan will be
Note that even the value is not null, Spark will cast the Long into Double first. Then if it's not null, Spark will cast it back to Long which results in losing precision.
The behavior should be that the original value should not be changed if it's not null, but Spark will change the value which is wrong.
With the PR, the logical plan will be
which behaves correctly without changing the original Long values.