Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.2
    • Fix Version/s: 1.3.0
    • Component/s: MLlib
    • Labels:
      None

      Description

      The KMeansPlusPlus algorithm is currently implemented in O(m k^2) time, where m is the number of rounds of the KMeansParallel algorithm and k is the number of clusters.

      This can be dramatically improved by maintaining, from round to round, each point's distance to its closest cluster center and incrementally updating that value whenever a new center is added. The incremental update takes O(1) time per point, which reduces the running time of KMeansPlusPlus to O(m k). For large k, this is significant.
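
      The idea can be sketched roughly as follows. This is a minimal, single-machine Python illustration, not the MLlib code: the names kmeans_pp_seed, sq_dist and cost are hypothetical, and squared Euclidean distance is assumed. Each point caches its squared distance to the nearest center chosen so far, so adding a center costs one distance computation per point rather than one per (point, center) pair.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seed(points, k, rng=None):
    """Pick k seed centers k-means++ style, caching nearest-center costs.

    Returns (centers, cost), where cost[i] is the squared distance from
    points[i] to its closest chosen center.
    """
    rng = rng or random.Random(0)
    centers = [points[rng.randrange(len(points))]]
    # Cached squared distance from each point to its closest center so far.
    cost = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        total = sum(cost)
        if total == 0.0:
            # Every point coincides with a chosen center; fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            # Sample an index with probability proportional to cached cost.
            r = rng.random() * total
            acc = 0.0
            idx = len(points) - 1  # fallback guards against rounding drift
            for i, c in enumerate(cost):
                acc += c
                if acc > r:
                    idx = i
                    break
        new_center = points[idx]
        centers.append(new_center)
        # Incremental update: compare the cached value against the new center
        # only, instead of recomputing the distance to every center.
        cost = [min(c, sq_dist(p, new_center)) for p, c in zip(points, cost)]

    return centers, cost
```

      In this sketch each added center triggers exactly one distance evaluation per point. Without the cache, the i-th center would need i evaluations per point, which is where the extra factor of k in the original running time comes from.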

              People

              • Assignee:
                derrickburns Derrick Burns
                Reporter:
                derrickburns Derrick Burns
              • Votes:
                0
                Watchers:
                6