[SPARK-2429] Hierarchical Implementation of KMeans - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: MLlib
Labels:
- clustering

Description

Hierarchical clustering algorithms are widely used and would make a nice addition to MLlib. They are useful for determining relationships between clusters, and the resulting hierarchy can also make assigning points to clusters faster. Discussion on the dev list suggested the following possible approaches:

- Top-down, recursive application of KMeans
- Reuse of the DecisionTree implementation with a different objective function
- Hierarchical SVD
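To make the first approach concrete, here is a minimal single-machine sketch of top-down, recursive 2-means in plain Python. It is not the MLlib implementation and not the code from the linked pull request. The names `ClusterNode`, `two_means`, `build_tree`, and `assign`, the `min_size` and `max_depth` stopping rules, and the farthest-point initialization are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


@dataclass
class ClusterNode:
    """One node of the (hypothetical) cluster tree."""
    center: List[float]
    points: List[Point]
    left: Optional["ClusterNode"] = None
    right: Optional["ClusterNode"] = None


def mean(points: Sequence[Point]) -> List[float]:
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points: Sequence[Point], iterations: int = 20) -> Tuple[list, list]:
    """Lloyd's algorithm with k = 2.

    Assumption: deterministic farthest-point initialization, used here
    only so the sketch is reproducible.
    """
    overall = mean(points)
    first = max(points, key=lambda p: sq_dist(p, overall))
    second = max(points, key=lambda p: sq_dist(p, first))
    centers = [list(first), list(second)]
    groups: Tuple[list, list] = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            idx = 0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            groups[idx].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. all points identical
        new_centers = [mean(groups[0]), mean(groups[1])]
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return groups


def build_tree(points: Sequence[Point], min_size: int = 2,
               max_depth: int = 8, _depth: int = 0) -> ClusterNode:
    """Recursively bisect `points` until a stopping rule applies."""
    node = ClusterNode(center=mean(points), points=list(points))
    if len(points) < 2 * min_size or _depth >= max_depth:
        return node
    left, right = two_means(points)
    if len(left) < min_size or len(right) < min_size:
        return node  # keep this node as a leaf rather than make tiny children
    node.left = build_tree(left, min_size, max_depth, _depth + 1)
    node.right = build_tree(right, min_size, max_depth, _depth + 1)
    return node


def leaves(node: ClusterNode) -> List[ClusterNode]:
    if node.left is None or node.right is None:
        return [node]
    return leaves(node.left) + leaves(node.right)


def assign(root: ClusterNode, point: Point) -> ClusterNode:
    """Descend the tree, comparing only the two child centers per level."""
    node = root
    while node.left is not None and node.right is not None:
        if sq_dist(point, node.left.center) <= sq_dist(point, node.right.center):
            node = node.left
        else:
            node = node.right
    return node
```

The `assign` function illustrates the "faster assignment" point from the description: a lookup compares a point against two centers per tree level instead of against every leaf center. The result is approximate, since the leaf reached by descending is not guaranteed to be the globally nearest one.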

It was also suggested that support for distance metrics other than Euclidean, such as negative dot product or cosine, is necessary.
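As an illustration of what pluggable distance metrics could look like, the sketch below defines the three measures named above as plain Python functions and passes one of them into a nearest-center lookup. The function names and the `distance` parameter are hypothetical and do not correspond to any MLlib API.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_dot(a: Vector, b: Vector) -> float:
    # Larger dot product means "closer", so negate it to get a score
    # that can be minimized like a distance.
    return -dot(a, b)


def cosine_distance(a: Vector, b: Vector) -> float:
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot(a, b) / norm


def nearest(point: Vector, centers: Sequence[Vector],
            distance: Distance = euclidean) -> int:
    """Index of the center that minimizes the chosen distance."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))
```

The choice of measure can change the assignment. With centers `(1.0, 0.0)` and `(0.0, 5.0)`, the point `(2.0, 1.0)` goes to the first center under Euclidean and cosine distance but to the second under negative dot product, because the second center has a much larger norm. Note also that negative dot product is not a true metric, and that averaging points is not in general the optimal center update for non-Euclidean measures, so a real implementation would need to account for that.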

Attachments

- The Result of Benchmarking a Hierarchical Clustering.pdf (09/Oct/14 12:18, 455 kB, Yu Ishikawa)
- benchmark2.html (20/Oct/14 08:45, 477 kB, Yu Ishikawa)
- 2014-10-20_divisive-hierarchical-clustering.pdf (20/Oct/14 15:29, 244 kB, Yu Ishikawa)
- benchmark-result.2014-10-29.html (29/Oct/14 13:15, 525 kB, Yu Ishikawa)

Issue Links

duplicates

SPARK-6517 Bisecting k-means clustering

Resolved

is related to

SPARK-2966 Add an approximation algorithm for hierarchical clustering to MLlib

Closed

relates to

SPARK-2966 Add an approximation algorithm for hierarchical clustering to MLlib

Closed

links to

[Github] Pull Request #2906 (yu-iskw)

People

Assignee: Yu Ishikawa

Reporter: R J Nowling

Shepherd: Xiangrui Meng

Votes: 2

Watchers: 15

Dates

Created: 10/Jul/14 14:16

Updated: 09/Nov/15 23:07

Resolved: 09/Nov/15 23:07