Commons Math / MATH-1031

Refactoring: Move variance calculation of a centroid cluster to its class


    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2
    • Fix Version/s: 3.3
    • Labels:


      Users may want to assess the quality of each cluster in a computed clustering. One way to do this is to calculate the cluster's variance.
      The variance calculation is already performed in other places (e.g. for the MultiKMeans), but it is not available to end users.
      I propose adding this functionality to CentroidCluster. The one issue to consider is that a cluster does not know which distance measure was used to compute it. In the implementation, I chose to parameterize the method with a distance measure, which also lets users compare cluster quality under different distance measures. Alternatively, the distance measure could be stored as a field that is set by the clustering algorithm.
      In the patch I went with the first approach and also changed the two other places where the variance is calculated so that they use the new method.
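      The idea of a variance method parameterized by a distance measure could be sketched roughly as follows. This is a standalone illustration, not the contents of centroid.patch: the names `DistanceFn`, `SketchCluster`, `SketchDistances` and `CentroidVarianceDemo` are hypothetical, and it assumes that "variance" means the unbiased sample variance of the point-to-centroid distances, with NaN for an empty cluster and 0 for a single point.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a distance measure (not the Commons Math interface). */
interface DistanceFn {
    double compute(double[] a, double[] b);
}

/** Two example measures, so the same cluster can be scored in different ways. */
final class SketchDistances {
    static final DistanceFn EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    static final DistanceFn MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    private SketchDistances() { }
}

/** Minimal centroid cluster: a center plus its member points. */
final class SketchCluster {
    private final double[] center;
    private final List<double[]> points;

    SketchCluster(double[] center, List<double[]> points) {
        this.center = center;
        this.points = points;
    }

    /**
     * Unbiased sample variance of the distances between each point and the
     * center, under the given measure. Assumed conventions: NaN for an empty
     * cluster, 0 for a cluster with a single point.
     */
    double variance(DistanceFn measure) {
        int n = points.size();
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        double mean = 0.0;
        for (double[] p : points) {
            mean += measure.compute(p, center);
        }
        mean /= n;
        double sumSq = 0.0;
        for (double[] p : points) {
            double dev = measure.compute(p, center) - mean;
            sumSq += dev * dev;
        }
        return sumSq / (n - 1);
    }
}

final class CentroidVarianceDemo {
    public static void main(String[] args) {
        SketchCluster cluster = new SketchCluster(
                new double[] {0.0, 0.0},
                Arrays.asList(new double[] {1.0, 1.0}, new double[] {3.0, 3.0}));
        // The caller chooses the measure; the cluster itself stores none.
        System.out.println(cluster.variance(SketchDistances.EUCLIDEAN));
        System.out.println(cluster.variance(SketchDistances.MANHATTAN));
    }
}
```

      Passing the measure as an argument keeps the cluster independent of the algorithm that produced it, at the cost of requiring the caller to supply the measure that was actually used when the result should reflect the original clustering.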

      1. centroid.patch
        3 kB
        Thorsten Schäfer


        Thorsten Schäfer created issue
        Thorsten Schäfer made changes
          Field | Original Value | New Value
          Attachment | | centroid.patch [ 12601004 ]
        Thomas Neidhart made changes
          Field | Original Value | New Value
          Status | Open [ 1 ] | Resolved [ 5 ]
          Fix Version/s | | 3.3 [ 12324600 ]
          Resolution | | Fixed [ 1 ]
        Luc Maisonobe made changes
          Field | Original Value | New Value
          Status | Resolved [ 5 ] | Closed [ 6 ]


          • Assignee:
            Thorsten Schäfer
          • Votes:
            0
          • Watchers:
            2


            • Created: