Commons Math / MATH-546

Truncation issue in KMeansPlusPlusClusterer

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.0
    • Labels:

      Description

      The for loop inside KMeansPlusPlusClusterer.chooseInitialClusters defines a variable
      int sum = 0;
      This variable should have type double rather than int. Because it is an int, every value accumulated in it is truncated to a whole number, so the method effectively works with distances that are square roots of integers. The effect is worst when the distances between points are typically less than 1.
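
      The truncation can be reproduced in isolation. The following is a minimal sketch, not the library's code: the class name TruncationSketch, both method names, and the sample distances are made up for illustration, and the sketch assumes the loop accumulates squared distances with a compound assignment (the only way a double can be added to an int without an explicit cast in Java).

```java
import java.util.Arrays;

public class TruncationSketch {

    /** Hypothetical helper: cumulative sum of squared distances using an int accumulator. */
    static double[] cumulativeWithInt(double[] distances) {
        double[] dx2 = new double[distances.length];
        int sum = 0;
        for (int i = 0; i < distances.length; i++) {
            // Compound assignment narrows the double result back to int,
            // silently dropping the fractional part on every iteration.
            sum += distances[i] * distances[i];
            dx2[i] = sum;
        }
        return dx2;
    }

    /** Hypothetical helper: the same loop with a double accumulator. */
    static double[] cumulativeWithDouble(double[] distances) {
        double[] dx2 = new double[distances.length];
        double sum = 0;
        for (int i = 0; i < distances.length; i++) {
            sum += distances[i] * distances[i];
            dx2[i] = sum;
        }
        return dx2;
    }

    public static void main(String[] args) {
        // Sample distances, all below 1 (illustrative values only).
        double[] distances = {0.5, 0.25, 0.75};
        System.out.println(Arrays.toString(cumulativeWithInt(distances)));    // [0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(cumulativeWithDouble(distances))); // [0.25, 0.3125, 0.875]
    }
}
```

      With the int accumulator every cumulative value stays at 0 when all distances are below 1, so the points become indistinguishable to any selection step that relies on those sums.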

      As an aside, in version 2.2, this bug manifested itself by making the clusterer return empty clusters. I wonder if the EmptyClusterStrategy would still be necessary if this bug were fixed.

            People

            • Assignee: Unassigned
            • Reporter: Nate Paymer (npaymer)