Commons Math / MATH-548

KMeansPlusPlusClusterer should run multiple trials

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      The interface and documentation for KMeansPlusPlusClusterer imply that a single call to cluster() is sufficient to get the optimal set of clusters. But this isn't true: k-means++ chooses its initial centers at random and converges only to a local optimum, so practically every client should be calling cluster() multiple times and selecting the best resulting set of clusters. It seems to me that rather than forcing every client to implement this functionality, it should be placed directly in the KMeansPlusPlusClusterer class.

      I propose adding a new method to KMeansPlusPlusClusterer:
      List<Cluster<T>> cluster(Collection<T> points, int k, int numTrials, int maxIterationsPerTrial)
      which calls the existing cluster() method numTrials times, returning the best result.
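
      The proposed behavior could be sketched roughly as follows. This is only an illustration, not the Commons Math implementation: the class name MultiTrialSketch, the method bestOf, and the cost function are all hypothetical, and the issue does not say how "best" should be measured. The sketch assumes that a lower cost (for example, the total squared distance from points to their cluster centers) means a better clustering.

```java
import java.util.function.Supplier;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of Commons Math: runs a randomized
// procedure several times and keeps the lowest-cost result.
class MultiTrialSketch {

    /**
     * @param numTrials number of independent runs (must be at least 1)
     * @param trial     one run of the randomized procedure, e.g. a single clustering
     * @param cost      assumed quality measure; lower values are treated as better
     */
    static <R> R bestOf(int numTrials, Supplier<R> trial, ToDoubleFunction<R> cost) {
        if (numTrials < 1) {
            throw new IllegalArgumentException("numTrials must be at least 1");
        }
        R best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int i = 0; i < numTrials; i++) {
            R candidate = trial.get();
            double candidateCost = cost.applyAsDouble(candidate);
            // Always accept the first run, then only strictly better ones.
            if (best == null || candidateCost < bestCost) {
                best = candidate;
                bestCost = candidateCost;
            }
        }
        return best;
    }
}
```

      Assuming the existing method takes (points, k, maxIterations), a client might call this as bestOf(numTrials, () -> clusterer.cluster(points, k, maxIterationsPerTrial), clusters -> someCost(clusters)), where someCost is whatever quality measure the client chooses.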

            People

            • Assignee: Unassigned
            • Reporter: Nate Paymer (npaymer)
