[SPARK-2308] Add KMeans MiniBatch clustering algorithm to MLlib - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
Component/s: MLlib
Labels:
- clustering

Description

Mini-batch KMeans is a variant of KMeans that, in each iteration, updates the cluster centers from a randomly sampled subset of the data points instead of the full data set. This lowers the cost of each iteration and, in some cases, also improves accuracy. The mini-batch variant is compatible with the KMeans|| initialization algorithm currently implemented in MLlib.
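To make the idea concrete, here is a minimal single-machine sketch in plain Python. It is not the code from the linked pull request and does not use Spark. It assumes the per-center running-average update that is commonly used for mini-batch k-means, and the names `mini_batch_kmeans`, `nearest_center`, `batch_size`, and `initial_centers` are hypothetical, chosen only for illustration. The starting centers are passed in by the caller, so any initialization (for example KMeans||) could supply them.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def mini_batch_kmeans(points, initial_centers, batch_size, iterations, seed=0):
    """Illustrative mini-batch k-means (assumed update rule, not MLlib code).

    Each iteration samples `batch_size` points without replacement,
    assigns them to their nearest center, and moves each center toward
    its assigned points with a learning rate of 1 / (points seen so far).
    """
    rng = random.Random(seed)
    centers = [list(c) for c in initial_centers]
    counts = [0] * len(centers)  # points assigned to each center so far

    for _ in range(iterations):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch against the centers as they stand at
        # the start of the iteration, then apply the updates.
        assignments = [nearest_center(p, centers) for p in batch]
        for point, c in zip(batch, assignments):
            counts[c] += 1
            eta = 1.0 / counts[c]
            centers[c] = [(1.0 - eta) * ci + eta * xi
                          for ci, xi in zip(centers[c], point)]
    return centers
```

Because each iteration touches only `batch_size` points, its cost does not grow with the size of the full data set, which is the source of the speedup described above. With this learning rate, each center ends up as the running mean of every point assigned to it so far.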

I suggest adding mini-batch KMeans to MLlib as an alternative to the existing KMeans implementation.

I'd like this to be assigned to me.

Attachments

- many_small_centers.pdf, 16/Jul/14 15:12, 20 kB, R J Nowling
- uneven_centers.pdf, 16/Jul/14 15:12, 15 kB, R J Nowling

Issue Links

is duplicated by: SPARK-14174 Implement the Mini-Batch KMeans (Resolved)
links to: [Github] Pull Request #1248 (rnowling)

People

Assignee: R J Nowling
Reporter: R J Nowling
Votes: 0
Watchers: 8

Dates

Created: 27/Jun/14 18:57
Updated: 01/Apr/16 15:05
Resolved: 02/Mar/15 23:00