Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 1.3.0
- Fix Version/s: None
Description
KMeans uses accumulators to compute the cost of a clustering at each iteration.
Each time a ShuffleMapTask completes, it increments the accumulators at the driver. If a task runs twice because of failures, the accumulators get incremented twice.
Because KMeans updates its cost accumulators inside ShuffleMapTasks, a re-executed task's cost can be counted twice, inflating the reported clustering cost.
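The double-counting can be illustrated with a small toy model. This is only a sketch in plain Python, not Spark's API: `ToyAccumulator`, `run_task`, the one-dimensional points, and the single cluster center are all hypothetical names and data chosen for illustration. It assumes the driver adds a task's partial cost every time that task completes, with no check for whether the same partition was already counted.

```python
# Toy model of the problem described above. Nothing here is Spark code;
# every name and value is a hypothetical stand-in.

class ToyAccumulator:
    """Driver-side counter that adds every update it receives, with no deduplication."""

    def __init__(self):
        self.value = 0.0

    def add(self, amount):
        self.value += amount


def run_task(partition, center):
    """Simulate one task: return the summed squared distance of its points to the center."""
    return sum((point - center) ** 2 for point in partition)


partitions = [[1.0, 2.0], [4.0, 6.0]]  # two partitions of 1-D points
center = 3.0                           # a single cluster center

# Clean run: each task completes exactly once.
clean = ToyAccumulator()
for partition in partitions:
    clean.add(run_task(partition, center))

# Run with a re-execution: the task for the second partition completes twice
# (for example after a failure forces recomputation), and the driver applies
# its update both times.
retried = ToyAccumulator()
for partition in partitions:
    retried.add(run_task(partition, center))
retried.add(run_task(partitions[1], center))

true_cost = clean.value        # 5.0 + 10.0
inflated_cost = retried.value  # 5.0 + 10.0 + 10.0

print(true_cost, inflated_cost)
```

In this sketch the second partition's contribution appears twice in `inflated_cost`, which is the effect the description attributes to a task running twice.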
Issue Links
- relates to SPARK-732 Recomputation of RDDs may result in duplicated accumulator updates (Closed)