Spark / SPARK-6001

K-Means clusterer should return the assignments of input points to clusters

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.1
    • Fix Version/s: 1.5.0
    • Component/s: MLlib
    • Labels: None

    Description

      The K-Means clusterer returns a KMeansModel that contains the cluster centers. I suggest that, when the assignments are available, the clusterer also return an RDD of the assignments of the input points to the clusters. The assignments can be computed from the KMeansModel, but returning them when they are already available saves the cost of recomputing them.

      The K-means implementation at https://github.com/derrickburns/generalized-kmeans-clustering returns the assignments when available.
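
The idea can be illustrated with a minimal sketch in plain Python. This is not Spark's MLlib code or API, and it is not the linked project's implementation; the names `kmeans_with_assignments`, `closest_center`, and `squared_distance` are hypothetical. It assumes a basic Lloyd-style loop, which already computes an assignment for every point on each pass, so the final assignments can be returned along with the centers at no extra cost.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_center(point: Point, centers: List[List[float]]) -> int:
    """Index of the center nearest to `point` (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda i: squared_distance(point, centers[i]))


def kmeans_with_assignments(
    points: List[Point],
    initial_centers: List[Point],
    max_iterations: int = 20,
) -> Tuple[List[List[float]], List[int]]:
    """Hypothetical Lloyd-style k-means returning (centers, assignments).

    The returned assignments are always computed against the returned
    centers, so the caller does not need a second pass over the data.
    """
    centers = [list(c) for c in initial_centers]
    assignments = [closest_center(p, centers) for p in points]

    for _ in range(max_iterations):
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[k])
        centers = new_centers

        # Assignment step: this is the work a caller would otherwise repeat.
        new_assignments = [closest_center(p, centers) for p in points]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments

    return centers, assignments


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, assignments = kmeans_with_assignments(points, [(0, 0), (10, 10)])
```

Here `assignments[i]` is the index into `centers` of the cluster that `points[i]` belongs to. A caller that only received `centers` would have to run the assignment step once more to recover the same list.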

            People

              Assignee: Yu Ishikawa (yuu.ishikawa@gmail.com)
              Reporter: Derrick Burns (derrickburns)
              Votes: 0
              Watchers: 6
