Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-3219

K-Means clusterer should support Bregman distance functions

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Won't Fix
    • None
    • None
    • MLlib

    Description

      The K-Means clusterer supports the Euclidean distance metric. However, it is rather straightforward to support Bregman (http://machinelearning.wustl.edu/mlpapers/paper_files/BanerjeeMDG05.pdf) distance functions which would increase the utility of the clusterer tremendously.

      I have modified the clusterer to support pluggable distance functions. However, I notice that there are hundreds of outstanding pull requests. If someone is willing to work with me to sponsor the work through the process, I will create a pull request. Otherwise, I will just keep my own fork.

      Attachments

        Issue Links

          Activity

            People

              derrickburns Derrick Burns
              derrickburns Derrick Burns
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: