I've noticed that accumulators from Spark 1.x no longer work with Spark 2.x, failing with
It happens while serializing an accumulator here
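If I read the Spark 2.x sources correctly, the relevant serialization hook is AccumulatorV2.writeReplace, which on the driver side replaces the accumulator with a freshly reset copy and asserts that the copy is zero. Roughly (a paraphrased sketch, not an exact excerpt of the Spark source):

```scala
// Paraphrased sketch of AccumulatorV2.writeReplace in Spark 2.x (not an exact excerpt).
// Before an accumulator is shipped from the driver, it is replaced by a
// freshly reset copy, and that copy is asserted to be "zero".
final protected def writeReplace(): Any = {
  if (atDriverSide) {
    val copyAcc = copyAndReset()
    assert(copyAcc.isZero, "copyAndReset must return a zero value copy")
    copyAcc
  } else {
    this
  }
}
```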
... although copyAndReset definitely returns a zero-value copy; just consider an accumulator like the one below.
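For illustration, a minimal accumulator of this kind might look like the following (hypothetical Counter value type and CounterParam names; the point is only that Counter does not override equals/hashCode):

```scala
import org.apache.spark.{AccumulatorParam, SparkConf, SparkContext}

// Hypothetical mutable value type with no value-based equals/hashCode,
// so two distinct "zero" instances never compare equal.
class Counter(var count: Long) extends Serializable

// Spark 1.x-style accumulator param for Counter.
object CounterParam extends AccumulatorParam[Counter] {
  def zero(initial: Counter): Counter = new Counter(0L)
  def addInPlace(a: Counter, b: Counter): Counter = { a.count += b.count; a }
}

object Repro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("repro").setMaster("local[*]"))
    // The deprecated Spark 1.x accumulator API is wrapped in LegacyAccumulatorWrapper internally.
    val acc = sc.accumulator(new Counter(0L))(CounterParam)
    // Shipping the accumulator to executors triggers writeReplace and the assertion.
    sc.parallelize(1 to 10).foreach(i => acc.add(new Counter(i)))
    sc.stop()
  }
}
```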
So Spark treats the zero value as non-zero because of how isZero is implemented in LegacyAccumulatorWrapper.
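As far as I can tell, the check boils down to comparing the current value with a freshly created zero value, so it relies entirely on the value type's equals. Roughly (paraphrased, not an exact excerpt):

```scala
// Paraphrased from LegacyAccumulatorWrapper (Spark 2.x):
// _value holds the current accumulated value, param is the legacy
// AccumulableParam; with plain reference equality this is almost always false.
override def isZero: Boolean = _value == param.zero(initialValue)
```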
All this means that the values being accumulated must implement equals and hashCode; otherwise isZero is very likely to always return false.
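For example, switching the illustrative value type above to a case class (hypothetical Counter2) gives it a structural equals/hashCode, and the zero copy then compares equal:

```scala
// Hypothetical variant of Counter: a case class gets structural
// equals/hashCode for free, so the freshly created zero value can actually
// compare equal to _value and LegacyAccumulatorWrapper.isZero works as intended.
case class Counter2(count: Long)

object Counter2Param extends org.apache.spark.AccumulatorParam[Counter2] {
  def zero(initial: Counter2): Counter2 = Counter2(0L)
  def addInPlace(a: Counter2, b: Counter2): Counter2 = Counter2(a.count + b.count)
}
```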
So I'm wondering whether the assertion
is really necessary, and whether it can safely be removed from there?
If not, is it OK to just override writeReplace in LegacyAccumulatorWrapper to prevent such failures?