[SPARK-12685] word2vec trainWordsCount gets overflow - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.6.0
Fix Version/s: 1.4.2, 1.5.3, 1.6.1, 2.0.0
Component/s: MLlib
Labels: None
Target Version/s: 1.4.2, 1.5.3, 1.6.1, 2.0.0

Description

The word2vec log reports a negative value,
trainWordsCount = -785727483
during computation over a large dataset, which indicates that the counter has overflowed.

I'll also add vocabsize to the log.

Raising the priority, since the overflow affects the computation itself: trainWordsCount is used when updating the learning rate,
alpha =
learningRate * (1 - numPartitions * wordCount.toDouble / (trainWordsCount + 1))
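
To illustrate why a negative trainWordsCount matters, here is a minimal sketch, assuming the total is accumulated in a 32-bit Int. The object name, the sample counts, and the parameter values are hypothetical and not taken from Spark's Word2Vec; only the alpha expression is copied from the description above.

```scala
object TrainWordsCountSketch {
  // Hypothetical per-word counts whose true sum exceeds Int.MaxValue.
  val counts: Seq[Int] = Seq.fill(2)(1500000000)

  // Accumulating in Int silently wraps around past Int.MaxValue.
  def totalAsInt(cn: Seq[Int]): Int = cn.foldLeft(0)(_ + _)

  // Accumulating in Long keeps the true total.
  def totalAsLong(cn: Seq[Int]): Long = cn.foldLeft(0L)(_ + _)

  // The alpha expression from the description, parameterised over the total.
  def alpha(learningRate: Double, numPartitions: Int,
            wordCount: Long, trainWordsCount: Long): Double =
    learningRate * (1 - numPartitions * wordCount.toDouble / (trainWordsCount + 1))

  def main(args: Array[String]): Unit = {
    val wrapped = totalAsInt(counts)   // negative after wrap-around
    val correct = totalAsLong(counts)  // true total
    println(s"Int total = $wrapped, Long total = $correct")
    // Hypothetical values: learningRate = 0.025, one partition, 1,000,000 words seen.
    println(s"alpha with Int total  = ${alpha(0.025, 1, 1000000L, wrapped)}")
    println(s"alpha with Long total = ${alpha(0.025, 1, 1000000L, correct)}")
  }
}
```

With a negative total the fraction changes sign, so alpha grows above learningRate as more words are processed instead of decaying towards zero. In this sketch, accumulating in Long avoids the problem.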

Issue Links

links to

[Github] Pull Request #10627 (hhbyyh)

[Github] Pull Request #10721 (hhbyyh)

People

Assignee: yuhao yang

Reporter: yuhao yang

Shepherd: Joseph K. Bradley

Votes: 0

Watchers: 4

Dates

Created: 07/Jan/16 04:28

Updated: 13/Jan/16 19:54

Resolved: 13/Jan/16 19:54