Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Hierarchical clustering algorithms are widely used and would make a nice addition to MLlib. They are useful for determining relationships between clusters, and the resulting hierarchy can also offer faster assignment of points to clusters. Discussion on the dev list suggested the following possible approaches:
- Top down, recursive application of KMeans
- Reuse DecisionTree implementation with different objective function
- Hierarchical SVD
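To make the first approach concrete, here is a minimal single-machine sketch of top-down, recursive 2-means in plain Python. It is not MLlib code and not the implementation that was eventually adopted; the names `two_means`, `build_tree`, and `assign`, the dict-based tree, and the `min_size`/`max_depth` stopping rules are all assumptions made for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def two_means(points, iters=20, seed=0):
    """Split points into two groups with plain Lloyd iterations (k = 2)."""
    rng = random.Random(seed)
    centers = rng.sample(points, 2)
    groups = (list(points), [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            idx = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[idx].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. duplicate points
        new_centers = [mean(g) for g in groups]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return groups

def build_tree(points, min_size=2, depth=0, max_depth=8):
    """Recursively bisect points; each node is a dict (hypothetical layout)."""
    node = {"center": mean(points), "size": len(points), "children": []}
    if len(points) <= min_size or depth >= max_depth:
        return node
    left, right = two_means(points)
    if left and right:
        node["children"] = [
            build_tree(left, min_size, depth + 1, max_depth),
            build_tree(right, min_size, depth + 1, max_depth),
        ]
    return node

def assign(node, point):
    """Descend to a leaf by following the nearer child center at each level."""
    while node["children"]:
        node = min(node["children"], key=lambda c: sq_dist(point, c["center"]))
    return node

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
tree = build_tree(points, min_size=3)
leaf = assign(tree, (0.2, 0.3))
```

Assignment in `assign` compares a point against two centers per level rather than against every leaf center, which is the kind of faster assignment a hierarchy can offer. The trade-off is that this greedy descent may not reach the globally nearest leaf.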
It was also suggested that support for distance metrics other than Euclidean, such as negative dot product or cosine, is necessary.
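As a rough illustration of why the choice of distance matters, the sketch below defines the three measures mentioned here as plain Python functions and passes them to a hypothetical `nearest` helper. None of these names come from MLlib; they are assumptions for the example.

```python
import math

def euclidean(a, b):
    """Standard Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_dot(a, b):
    """Negative dot product: a larger dot product means 'closer'."""
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """One minus cosine similarity; undefined for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)

def nearest(point, centers, distance=euclidean):
    """Index of the center that minimizes the given distance function."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))

point = (10, 1)
centers = [(1, 0), (8, 8)]
by_euclidean = nearest(point, centers, euclidean)
by_cosine = nearest(point, centers, cosine_distance)
by_neg_dot = nearest(point, centers, negative_dot)
```

With these inputs the cosine measure picks the first center, which points in nearly the same direction as the point, while the Euclidean and negative-dot measures pick the second. Note that negative dot product is not a metric in the strict sense, since it can be negative, so an algorithm that relies on metric properties would need care when accepting it.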
Issue Links
- duplicates: SPARK-6517 Bisecting k-means clustering (Resolved)
- is related to: SPARK-2966 Add an approximation algorithm for hierarchical clustering to MLlib (Closed)
- relates to: SPARK-2966 Add an approximation algorithm for hierarchical clustering to MLlib (Closed)
- links to