[SPARK-6137] G-Means clustering algorithm implementation - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
Component/s: MLlib
Labels:
- clustering

Description

Would it be useful to implement the G-means clustering algorithm on top of k-means?
G-means is a powerful extension of k-means that tests whether the data in each cluster is normally distributed in order to decide whether the cluster should be split in two. Its complexity relative to k-means is O(K), where K is the maximum number of clusters.
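
To make the split decision concrete, here is a minimal sketch of the normality check for a single cluster, using only the Python standard library. It is not taken from any existing MLlib code. The function names are hypothetical, the two child centers are assumed to come from a 2-means run on the cluster's points, and the correction factor and the default critical value (1.8692, for a significance level of 0.0001) are quoted from memory of the linked paper and should be checked against it.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def anderson_darling(xs):
    """Corrected Anderson-Darling statistic for normality.

    Mean and variance are estimated from the sample itself.
    """
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    zs = sorted((x - mean) / std for x in xs)
    eps = 1e-12  # keeps log() finite for extreme values
    total = 0.0
    for i, z in enumerate(zs):
        low = min(max(normal_cdf(z), eps), 1.0 - eps)
        high = min(max(normal_cdf(zs[n - 1 - i]), eps), 1.0 - eps)
        total += (2 * i + 1) * (math.log(low) + math.log(1.0 - high))
    a2 = -n - total / n
    # Small-sample correction (assumed to match the paper).
    return a2 * (1.0 + 4.0 / n - 25.0 / n ** 2)


def should_split(points, child_a, child_b, critical=1.8692):
    """Return True if the cluster looks non-Gaussian along the split axis.

    points: list of equal-length tuples belonging to one cluster.
    child_a, child_b: candidate child centers (assumed to come from 2-means).
    """
    v = [b - a for a, b in zip(child_a, child_b)]
    norm2 = sum(c * c for c in v)
    # Project every point onto the vector connecting the two child centers.
    projected = [sum(p_i * v_i for p_i, v_i in zip(p, v)) / norm2 for p in points]
    return anderson_darling(projected) > critical


# Small demo on synthetic data: one Gaussian blob and two well-separated blobs.
rng = random.Random(0)
one_blob = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(300)]
two_blobs = [(rng.gauss(-5, 1), rng.gauss(0, 1)) for _ in range(150)] + [
    (rng.gauss(5, 1), rng.gauss(0, 1)) for _ in range(150)
]
print(should_split(one_blob, (-1.0, 0.0), (1.0, 0.0)))
print(should_split(two_blobs, (-5.0, 0.0), (5.0, 0.0)))
```

A full G-means loop would apply this check to every current cluster, replace each cluster that fails the test with its two children, and rerun k-means until no cluster is split or the maximum number of clusters is reached.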

The original paper is by Greg Hamerly and Charles Elkan of the University of California:
http://papers.nips.cc/paper/2526-learning-the-k-in-k-means.pdf

I also have a small prototype of this algorithm written in R, if anyone is interested.

People

Assignee: Unassigned

Reporter: Denis Dus

Votes: 0

Watchers: 7

Dates

Created: 03/Mar/15 09:54

Updated: 21/Jan/16 16:26

Resolved: 21/Jan/16 16:26