Description
The for loop inside KMeansPlusPlusClusterer.chooseInitialClusters defines a variable
int sum = 0;
This variable should have type double, rather than int. Using an int causes the method to truncate the distances between points to (square roots of) integers. It's especially bad when the distances between points are typically less than 1.
As an aside, in version 2.2, this bug manifested itself by making the clusterer return empty clusters. I wonder if the EmptyClusterStrategy would still be necessary if this bug were fixed.