  Commons Math / MATH-546

Truncation issue in KMeansPlusPlusClusterer

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.0
    • None

    Description

      The for loop inside KMeansPlusPlusClusterer.chooseInitialClusters defines a variable
      int sum = 0;
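      This variable should have type double rather than int. With an int, the method truncates the squared distances between points to integers, so the distances themselves are effectively rounded down to square roots of integers. The effect is worst when the distances between points are typically less than 1, because each such squared distance then contributes 0 to the sum.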
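
      The truncation can be reproduced in isolation. The sketch below is not the Commons Math source: the class SumTruncationDemo and both of its methods are hypothetical names, and it assumes the squared distance is accumulated with a compound assignment (sum += d * d), which Java accepts on an int because the operator inserts an implicit narrowing cast.

```java
import java.util.Arrays;

// Hypothetical stand-alone demo; not taken from KMeansPlusPlusClusterer.
public class SumTruncationDemo {

    // Buggy variant: the running sum of squared distances is an int.
    public static double[] cumulativeWithIntSum(double[] distances) {
        double[] dx2 = new double[distances.length];
        int sum = 0;
        for (int i = 0; i < distances.length; i++) {
            double d = distances[i];
            // Equivalent to sum = (int) (sum + d * d): the fraction is dropped.
            sum += d * d;
            dx2[i] = sum;
        }
        return dx2;
    }

    // Fixed variant: the running sum keeps its fractional part.
    public static double[] cumulativeWithDoubleSum(double[] distances) {
        double[] dx2 = new double[distances.length];
        double sum = 0;
        for (int i = 0; i < distances.length; i++) {
            double d = distances[i];
            sum += d * d;
            dx2[i] = sum;
        }
        return dx2;
    }

    public static void main(String[] args) {
        // All distances are below 1, the case the report calls out.
        double[] distances = {0.5, 0.25, 0.75};
        System.out.println(Arrays.toString(cumulativeWithIntSum(distances)));
        // prints [0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(cumulativeWithDoubleSum(distances)));
        // prints [0.25, 0.3125, 0.875]
    }
}
```

      With the int accumulator every cumulative weight is 0, so a selection step that relies on these weights can no longer tell the points apart. With the double accumulator the weights grow as expected.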

      As an aside, in version 2.2, this bug manifested itself by making the clusterer return empty clusters. I wonder if the EmptyClusterStrategy would still be necessary if this bug were fixed.

      Attachments

        1. MATH-546.txt (5 kB, Nate Paymer)

          People

            Assignee: Unassigned
            Reporter: Nate Paymer (npaymer)