Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
3.0
None
None
Description
To decide whether the clusters have changed between iterations, KMeansPlusPlusClusterer currently calls equals on the cluster centers. It would be better to avoid relying on equals and instead check whether any points have moved between clusters.
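The proposed check can be sketched as follows. This is a minimal, standalone illustration and not the Commons Math implementation: the class AssignmentConvergence and its methods nearest, assign, updateCenters and cluster are hypothetical names, and points are plain double[] arrays rather than the library's own types. The loop stops as soon as a pass reassigns no point, so the centers are never compared with equals.

```java
import java.util.Arrays;

// Hypothetical sketch: k-means iteration that converges when no point changes cluster.
public class AssignmentConvergence {

    /** Returns the index of the center nearest to p (squared Euclidean distance). */
    static int nearest(double[] p, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0;
            for (int i = 0; i < p.length; i++) {
                double diff = p[i] - centers[c][i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Reassigns every point to its nearest center; returns how many points moved. */
    static int assign(double[][] points, double[][] centers, int[] assignment) {
        int moved = 0;
        for (int i = 0; i < points.length; i++) {
            int c = nearest(points[i], centers);
            if (c != assignment[i]) {
                assignment[i] = c;
                moved++;
            }
        }
        return moved;
    }

    /** Replaces each center with the mean of its points; empty clusters keep their center. */
    static void updateCenters(double[][] points, double[][] centers, int[] assignment) {
        int dim = points[0].length;
        double[][] sums = new double[centers.length][dim];
        int[] counts = new int[centers.length];
        for (int i = 0; i < points.length; i++) {
            counts[assignment[i]]++;
            for (int d = 0; d < dim; d++) {
                sums[assignment[i]][d] += points[i][d];
            }
        }
        for (int c = 0; c < centers.length; c++) {
            if (counts[c] == 0) {
                continue;
            }
            for (int d = 0; d < dim; d++) {
                centers[c][d] = sums[c][d] / counts[c];
            }
        }
    }

    /** Iterates until no point moves between clusters or maxIterations is reached. */
    static int[] cluster(double[][] points, double[][] centers, int maxIterations) {
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1); // -1 means "not assigned yet"
        for (int iter = 0; iter < maxIterations; iter++) {
            if (assign(points, centers, assignment) == 0) {
                break; // converged: the partition is unchanged
            }
            updateCenters(points, centers, assignment);
        }
        return assignment;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0, 1}, {10, 10}, {10, 11}};
        double[][] centers = {{0, 0}, {10, 10}};
        System.out.println(Arrays.toString(cluster(points, centers, 100)));
    }
}
```

Because convergence is defined on the integer assignment array, the result does not depend on how the centroid arithmetic rounds or on whether the point type overrides equals.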
equals can be problematic because floating-point addition is not associative, so getCentroid may return slightly different values for the same set of points when they are accumulated in a different order. Additionally, the client may not have overridden equals at all, since it is not clear that doing so is required.
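The rounding issue is easy to reproduce outside the clusterer. The following sketch uses a hypothetical class SummationOrder with a plain left-to-right sum method, standing in for the accumulation step of a centroid computation; it is an illustration of the floating-point behavior only, not code from the library.

```java
// Hypothetical sketch: the same values summed in a different order give different doubles.
public class SummationOrder {

    /** Sums the values left to right, as a simple centroid accumulation would. */
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        double forward = sum(new double[] {0.1, 0.2, 0.3});
        double backward = sum(new double[] {0.3, 0.2, 0.1});

        // An exact comparison sees two different results for the same set of inputs.
        System.out.println("forward == backward: " + (forward == backward));

        // A tolerance-based comparison treats them as equal.
        System.out.println("within 1e-12: " + (Math.abs(forward - backward) <= 1e-12));
    }
}
```

An exact equals on centroids built this way can report a change even though the cluster membership is identical, which is why comparing assignments is the more robust test.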