[SPARK-3097] Word2Vec Performance Improvement - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.1.0
Fix Version/s: 1.1.0
Component/s: MLlib
Labels: None

Description

For each partition, the output model contains only the words that appear in that partition, and reduceByKey is used to combine the models from different partitions. This reduces shuffle write and improves performance.
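The idea can be illustrated with a minimal sketch in plain Python, with no Spark dependency. It is not the MLlib code: `train_partition`, `reduce_by_key`, and `add_vectors` are hypothetical helpers, the "training" step is a placeholder that just increments vector components, and combining partial models by summing vectors is an assumption made for the example. What it shows is that each partition emits (word index, vector) pairs only for the words it actually saw, so the data to be combined grows with the words touched per partition rather than with the full vocabulary.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Vector = List[float]


def train_partition(
    word_ids: Iterable[int],
    global_vectors: List[Vector],
    step: float = 1.0,
) -> List[Tuple[int, Vector]]:
    """Placeholder for per-partition training (hypothetical).

    Returns (word index, vector) pairs only for words seen in this partition.
    """
    touched: Dict[int, Vector] = {}
    for w in word_ids:
        if w not in touched:
            # Start from a copy of the shared vector for this word.
            touched[w] = list(global_vectors[w])
        vec = touched[w]
        for i in range(len(vec)):
            vec[i] += step  # stand-in for a real gradient update
    return list(touched.items())


def add_vectors(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two vectors (assumed combine rule)."""
    return [x + y for x, y in zip(a, b)]


def reduce_by_key(
    pairs: Iterable[Tuple[int, Vector]],
    combine: Callable[[Vector, Vector], Vector],
) -> Dict[int, Vector]:
    """Local stand-in for a reduceByKey step: merge values sharing a key."""
    merged: Dict[int, Vector] = {}
    for key, value in pairs:
        merged[key] = combine(merged[key], value) if key in merged else value
    return merged


if __name__ == "__main__":
    vocab_size, dim = 4, 2
    global_vectors = [[0.0] * dim for _ in range(vocab_size)]
    partitions = [[0, 1, 0], [1, 2]]  # word indices seen by each partition

    pairs = [p for part in partitions for p in train_partition(part, global_vectors)]
    merged = reduce_by_key(pairs, add_vectors)
    print(len(pairs), "pairs emitted;", len(partitions) * vocab_size, "if full models were sent")
    print(merged)
```

In this toy run, word 3 never appears in any partition, so no pair is emitted for it and it is absent from the merged result.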

People

Assignee: Liquan Pei

Reporter: Liquan Pei

Votes: 0

Watchers: 2

Dates

Created: 18/Aug/14 06:25

Updated: 18/Aug/14 06:30

Resolved: 18/Aug/14 06:30