[SPARK-5962] [MLLIB] Python support for Power Iteration Clustering - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.5.0
Component/s: MLlib
Labels: python
Target Version/s: 1.5.0

Description

Add Python support for the Power Iteration Clustering (PIC) feature. Below is a fragment of the JVM-side stub that the Python API will call, as we plan to implement it:

/**
 * Java stub for Python mllib PowerIterationClustering.run()
 */
def trainPowerIterationClusteringModel(
    data: JavaRDD[(java.lang.Long, java.lang.Long, java.lang.Double)],
    k: Int,
    maxIterations: Int,
    runs: Int,
    initializationMode: String,
    seed: java.lang.Long): PowerIterationClusteringModel = {
  val picAlg = new PowerIterationClustering()
    .setK(k)
    .setMaxIterations(maxIterations)

  try {
    picAlg.run(data.rdd.persist(StorageLevel.MEMORY_AND_DISK))
  } finally {
    data.rdd.unpersist(blocking = false)
  }
}
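For readers unfamiliar with the algorithm behind this stub, the following is a minimal single-machine sketch in plain Python of what power iteration clustering does with (src, dst, similarity) triples, the same input shape the stub accepts. It is not the MLlib implementation or the proposed Python API: the function names `power_iteration_clustering` and `_kmeans_1d` are hypothetical, similarities are assumed to be positive and symmetric, and a degree-proportional start vector and a fixed 1-D k-means initialization are assumptions chosen here to keep the example deterministic.

```python
def power_iteration_clustering(similarities, k, max_iterations=10):
    """Toy, single-machine PIC over (src, dst, similarity) triples.

    Hypothetical helper for illustration; assumes positive, symmetric
    similarities so that every node has a non-zero degree.
    """
    # Build a symmetric weighted adjacency map from the triples.
    adj = {}
    for src, dst, sim in similarities:
        adj.setdefault(src, {})[dst] = sim
        adj.setdefault(dst, {})[src] = sim
    nodes = sorted(adj)
    degree = {i: sum(adj[i].values()) for i in nodes}

    # Start vector: each entry proportional to the node's degree, summing to 1.
    total = sum(degree.values())
    v = {i: degree[i] / total for i in nodes}

    for _ in range(max_iterations):
        # v <- W v, where W = D^-1 A is the row-normalized affinity matrix.
        nxt = {i: sum(w * v[j] for j, w in adj[i].items()) / degree[i]
               for i in nodes}
        # Rescale by the L1 norm so the entries keep summing to 1.
        norm = sum(abs(x) for x in nxt.values())
        v = {i: x / norm for i, x in nxt.items()}

    # Cluster the entries of the truncated power-iteration vector.
    return _kmeans_1d(v, k)


def _kmeans_1d(values, k, rounds=20):
    """Lloyd's algorithm on scalars, with centers spread evenly over [min, max]."""
    lo, hi = min(values.values()), max(values.values())
    centers = [lo + (hi - lo) * (c + 0.5) / k for c in range(k)]
    labels = {}
    for _ in range(rounds):
        # Assign each node to its nearest center.
        labels = {i: min(range(k), key=lambda c: abs(x - centers[c]))
                  for i, x in values.items()}
        # Move each center to the mean of its members.
        for c in range(k):
            members = [values[i] for i in labels if labels[i] == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = sum(members) / len(members)
    return labels


# A triangle (nodes 0-2) and a 4-clique (nodes 3-6) joined by one weak edge.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (3, 6, 1.0),
         (4, 5, 1.0), (4, 6, 1.0), (5, 6, 1.0),
         (2, 3, 0.01)]
labels = power_iteration_clustering(edges, k=2)
```

On this input, nodes 0 to 2 end up sharing one label and nodes 3 to 6 the other. Stopping after a small number of iterations matters: run to convergence, the vector becomes constant and carries no cluster information, which is why the stub exposes `maxIterations`.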

Issue Links

duplicates: SPARK-5963 [MLLIB] Python support for Power Iteration Clustering (Closed)
is contained by: SPARK-7536 Audit MLlib Python API for 1.4 (Resolved)
is duplicated by: SPARK-6260 Python API for PowerIterationClustering (Closed)
is related to: SPARK-7541 Check model save/load for MLlib 1.4 (Resolved)
relates to: SPARK-6254 MLlib Python API parity check at 1.3 release (Closed)
links to: [Github] Pull Request #6992 (yanboliang)

People

Assignee: Yanbo Liang
Reporter: Stephen Boesch
Shepherd: Xiangrui Meng
Votes: 0
Watchers: 4

Dates

Created: 24/Feb/15 02:15
Updated: 29/Jun/15 05:38
Resolved: 29/Jun/15 05:38

Time Tracking

Estimated: 168h
Remaining: 168h
Logged: Not Specified