  Mahout / MAHOUT-1959

BallKMeans.iterativeAssignment can set wrong weights.

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: classic-15.0
    • Component/s: None
    • Labels: None

    Description

      I notice that the BallKMeans.iterativeAssignment method uses the following code to calculate weights:

      BallKMeans.java
      for (WeightedVector datapoint : datapoints) {
        Centroid closestCentroid = (Centroid) centroids.searchFirst(datapoint, false).getValue();
        closestCentroid.setWeight(closestCentroid.getWeight() + datapoint.getWeight());
      }
      

      The buggy code in MAHOUT-1237 obtained the weight through the same pattern:

      ClusteringUtils.java
      for (Vector vector : datapoints) {
        Centroid closest = (Centroid) centroids.searchFirst(vector, false).getValue();
        totalCost += closest.getWeight();
      }
      

      The fixed code is as follows:

      ClusteringUtils.java
      for (Vector vector : datapoints) {
        totalCost += centroids.searchFirst(vector, false).getWeight();
      }
      

      Because BallKMeans.iterativeAssignment also reads the weight from the value returned by searchFirst(...).getValue(), I am not sure whether it sets the right weights. Please check it.
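
The difference between the two getWeight() calls can be sketched in isolation. The following is a minimal, self-contained Java example, not Mahout code: SimpleCentroid, SearchResult, and this searchFirst are hypothetical stand-ins written for illustration. It assumes that the weight on the search result is the distance from the query to the match, while the weight on the matched centroid is its accumulated mass; that assumption should be checked against the real classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Mahout's classes.
class WeightSketch {

    /** Simplified centroid: a 1-D position plus an accumulated weight (mass). */
    static final class SimpleCentroid {
        final double position;
        double weight;

        SimpleCentroid(double position, double weight) {
            this.position = position;
            this.weight = weight;
        }
    }

    /** Simplified search result: the matched centroid and the distance to it. */
    static final class SearchResult {
        final SimpleCentroid value;
        final double weight; // assumed: distance from the query to value

        SearchResult(SimpleCentroid value, double weight) {
            this.value = value;
            this.weight = weight;
        }
    }

    /** Linear scan for the nearest centroid; assumes a non-empty list. */
    static SearchResult searchFirst(List<SimpleCentroid> centroids, double query) {
        SimpleCentroid best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (SimpleCentroid c : centroids) {
            double d = Math.abs(c.position - query);
            if (d < bestDistance) {
                best = c;
                bestDistance = d;
            }
        }
        return new SearchResult(best, bestDistance);
    }

    /** Cost computation: sums the result's weight (the distance to the match). */
    static double totalCost(List<SimpleCentroid> centroids, double[] points) {
        double total = 0.0;
        for (double p : points) {
            total += searchFirst(centroids, p).weight;
        }
        return total;
    }

    /** Assignment: adds each point's weight to the weight of its closest centroid. */
    static void assignWeights(List<SimpleCentroid> centroids, double[] points, double[] pointWeights) {
        for (int i = 0; i < points.length; i++) {
            SimpleCentroid closest = searchFirst(centroids, points[i]).value;
            closest.weight += pointWeights[i];
        }
    }

    public static void main(String[] args) {
        List<SimpleCentroid> centroids =
            Arrays.asList(new SimpleCentroid(0.0, 0.0), new SimpleCentroid(10.0, 0.0));
        double[] points = {1.0, 2.0, 9.0};
        double[] pointWeights = {1.0, 1.0, 1.0};

        System.out.println("total cost = " + totalCost(centroids, points));
        assignWeights(centroids, points, pointWeights);
        for (SimpleCentroid c : centroids) {
            System.out.println("centroid at " + c.position + " has weight " + c.weight);
        }
    }
}
```

With centroids at 0 and 10 and unit-weight points at 1, 2, and 9, the cost is the sum of distances (4), and the centroids end up with weights 2 and 1. Under the stated assumption, reading the centroid's own weight is wrong for a cost computation but may be what an assignment loop intends, so the question for BallKMeans.iterativeAssignment is which of the two quantities it is meant to accumulate.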

          People

            Assignee: Shashanka Balakuntala Srinivasa (balakuntala)
            Reporter: Hao Zhong (haozhong)