Description
MLlib does not have any general-purpose clustering metrics that compare a clustering against a ground truth.
In [Scikit-Learn](http://scikit-learn.org/stable/modules/classes.html#clustering-metrics), there are several kinds of metrics for this.
It would be useful to add some of these clustering metrics to MLlib.
They should be added as a ClusteringEvaluator class that extends Evaluator in spark.ml.
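To make "clustering metric with a ground truth" concrete, here is a minimal plain-Python sketch of the Rand index, one metric of the kind Scikit-Learn offers. It is an illustration only: the function name `rand_index` is hypothetical, and this is not MLlib code or the proposed ClusteringEvaluator API.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree.

    A pair counts as an agreement when both labelings put the two points
    in the same group, or both put them in different groups.
    Hypothetical helper for illustration; not part of MLlib.
    """
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    if n < 2:
        raise ValueError("at least two points are needed to form a pair")

    agreements = 0
    for i, j in combinations(range(n), 2):
        same_in_truth = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        if same_in_truth == same_in_pred:
            agreements += 1

    total_pairs = n * (n - 1) // 2
    return agreements / total_pairs


# Cluster ids need not match the true label ids, only the grouping matters.
perfect = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
partial = rand_index([0, 0, 1, 1], [0, 0, 0, 1])
```

Because the score depends only on which points are grouped together, it is unaffected by how cluster ids are numbered, which is the property a ground-truth metric in an evaluator would need.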
Issue Links
- is related to
  - SPARK-22440 Add Calinski-Harabasz index to ClusteringEvaluator (Resolved)
  - SPARK-23217 Add cosine distance measure to ClusteringEvaluator (Resolved)
- relates to
  - SPARK-21981 Python API for ClusteringEvaluator (Resolved)