Currently, KMeans assumes that Euclidean distance is the only distance measure that can be used.
In some use cases, e.g. text mining, other distance measures such as cosine distance are widely used. For such use cases, it would be good to support multiple distance measures.
This ticket adds support for the cosine distance measure in KMeans. Later, other algorithms can be extended to support multiple distance measures, and further distance measures can be added.
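To illustrate why the choice of measure matters for text data, here is a minimal plain-Python sketch (not the KMeans implementation itself). It uses the usual definition of cosine distance, 1 minus the cosine similarity. The helper names and the sample term-count vectors are made up for illustration only.

```python
import math

def euclidean_distance(a, b):
    # Straight-line distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # 1 - cosine similarity; depends only on the angle between the vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical term-count vectors: long_doc has the same term proportions
# as short_doc but is ten times longer; other_doc shares no terms with them.
short_doc = [1.0, 2.0, 0.0]
long_doc = [10.0, 20.0, 0.0]
other_doc = [0.0, 0.0, 3.0]

print(euclidean_distance(short_doc, long_doc))  # large: the lengths differ
print(cosine_distance(short_doc, long_doc))     # close to 0: same direction
print(cosine_distance(short_doc, other_doc))    # 1.0: orthogonal vectors
```

Under Euclidean distance the two documents with identical term proportions look far apart simply because one is longer, whereas cosine distance treats them as nearly identical. This is the behaviour that text mining use cases typically want.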
- is related to
SPARK-23217 Add cosine distance measure to ClusteringEvaluator