  1. Commons Math
  2. MATH-584

KMeansPlusPlusClusterer incorrectly selects initial cluster centers and is unnecessarily slow


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2
    • Fix Version/s: 3.0
    • Environment:

      All environments

      Description

      The chooseInitialClusters() method declares sum as an int when it should be a double. It is also quite slow because it performs a lot of unnecessary computation. I'll attach a patch that corrects these problems.
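To illustrate the report, here is a minimal sketch of k-means++ seeding in Java. It is not the Commons Math source and not the attached patch. The class `SeedingSketch` and its methods are hypothetical names introduced for illustration. `intSum` shows how an `int` accumulator discards fractional distance weights, and `chooseInitialCenters` keeps the running sum in a `double`. Caching each point's squared distance to its nearest chosen center is one common way to avoid repeated work; whether the patch uses that approach is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration only; not the Commons Math implementation. */
final class SeedingSketch {

    /** Accumulates into an int: each += narrows the result back to int. */
    static int intSum(double[] values) {
        int sum = 0;
        for (double v : values) {
            sum += v; // compiles, but the fractional part is dropped every time
        }
        return sum;
    }

    /** Accumulates into a double, preserving fractional weights. */
    static double doubleSum(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    /** Squared Euclidean distance between two points of equal dimension. */
    static double distanceSq(double[] a, double[] b) {
        double d = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            d += diff * diff;
        }
        return d;
    }

    /** Returns the indices of k initial centers chosen with D^2 weighting. */
    static List<Integer> chooseInitialCenters(double[][] points, int k, Random rng) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be in [1, points.length]");
        }
        List<Integer> centers = new ArrayList<>();
        boolean[] taken = new boolean[n];
        double[] minDistSq = new double[n]; // cached distance to nearest center

        // The first center is drawn uniformly at random.
        int first = rng.nextInt(n);
        centers.add(first);
        taken[first] = true;
        for (int i = 0; i < n; i++) {
            minDistSq[i] = distanceSq(points[i], points[first]);
        }

        while (centers.size() < k) {
            // The total weight must be a double, or small distances vanish.
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                if (!taken[i]) {
                    sum += minDistSq[i];
                }
            }
            // Pick the first untaken point whose cumulative weight reaches r.
            double r = rng.nextDouble() * sum;
            double cumulative = 0.0;
            int next = -1;
            for (int i = 0; i < n; i++) {
                if (taken[i]) {
                    continue;
                }
                next = i; // fallback if rounding keeps cumulative below r
                cumulative += minDistSq[i];
                if (cumulative >= r) {
                    break;
                }
            }
            centers.add(next);
            taken[next] = true;
            // Only distances to the new center need computing; keep the minimum.
            for (int i = 0; i < n; i++) {
                double d = distanceSq(points[i], points[next]);
                if (d < minDistSq[i]) {
                    minDistSq[i] = d;
                }
            }
        }
        return centers;
    }
}
```

With weights such as 0.4, 0.4 and 0.4, `intSum` returns 0 while `doubleSum` returns roughly 1.2, so a selection threshold computed from the `int` total no longer reflects the intended distribution.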

      I found the problems while comparing an optimized implementation of KMeans++ I've been working on with the one in Commons Math.

        Attachments

        1. kmeans_plus_plus.patch
          6 kB
          Randall Scarberry
        2. kmeans_plus_plus.patch
          6 kB
          Randall Scarberry


            People

            • Assignee:
              Unassigned
              Reporter:
              drrandys Randall Scarberry

              Dates

              • Created:
                Updated:
                Resolved: