[SPARK-14153] My dataset does not provide proper predictions in ALS - ASF JIRA

Details

Type: Question
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: Java API, ML
Labels: None

Description

When I use the dataset from the GitHub example, I get proper predictions. But when I use my own dataset, it does not predict well (it has a large RMSE).
I ran CrossValidator for ALS (in Spark ML), and here are the cross-validation metrics and the best model parameters it reported for my dataset:

16/03/25 12:03:06 INFO CrossValidator: Average cross-validation metrics: WrappedArray(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN)
16/03/25 12:03:06 INFO CrossValidator: Best set of parameters:
{
als_c911c0e183a3-alpha: 0.02,
als_c911c0e183a3-rank: 500,
als_c911c0e183a3-regParam: 0.03
}

But when I use the movie dataset, it gives proper values for the parameters, as shown below:
16/03/24 14:07:07 INFO CrossValidator: Average cross-validation metrics: WrappedArray(1.9481584447713676, 2.0501457159728944, 2.0600857505406935, 1.9457234533860048, 2.0494498583414282, 2.0595306613827002, 1.9488322049918922, 2.0489573853226797, 2.0584252131752, 1.9464006741621391, 2.048241271354197, 2.057853990227443)
16/03/24 14:07:07 INFO CrossValidator: Best set of parameters:
{
als_31a605e7717b-alpha: 0.02,
als_31a605e7717b-rank: 1,
als_31a605e7717b-regParam: 0.02
}
16/03/24 14:07:07 INFO CrossValidator: Best cross-validation metric: 1.9457234533860048.
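Note on the NaN metrics in the first log: when every cross-validation metric is NaN, the reported "best" parameter set is not a meaningful choice, since NaN values cannot be ranked against each other. The sketch below is plain Java, not Spark code, and only illustrates the arithmetic: a single NaN prediction is enough to turn the whole RMSE into NaN. The assumption (not confirmed in this ticket) is that ALS emits NaN predictions for users or items that appear in a validation fold but not in the corresponding training fold. The class and method names are hypothetical.

```java
// Hypothetical helper class; illustrates NaN propagation only, not Spark's evaluator.
class RmseNaNDemo {

    /** Root-mean-square error over all (label, prediction) pairs. */
    static double rmse(double[] labels, double[] predictions) {
        double sumSq = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double err = labels[i] - predictions[i];
            sumSq += err * err; // NaN + anything is NaN, so one NaN poisons the sum
        }
        return Math.sqrt(sumSq / labels.length);
    }

    /** RMSE over only those pairs whose prediction is not NaN. */
    static double rmseIgnoringNaN(double[] labels, double[] predictions) {
        double sumSq = 0.0;
        int n = 0;
        for (int i = 0; i < labels.length; i++) {
            if (Double.isNaN(predictions[i])) {
                continue; // skip rows the model could not score
            }
            double err = labels[i] - predictions[i];
            sumSq += err * err;
            n++;
        }
        // If no row could be scored, there is no meaningful metric.
        return n == 0 ? Double.NaN : Math.sqrt(sumSq / n);
    }

    public static void main(String[] args) {
        double[] labels = {4.0, 3.0, 5.0};
        // Assumed scenario: the second row belongs to a user unseen in training.
        double[] predictions = {3.5, Double.NaN, 4.5};

        System.out.println(rmse(labels, predictions));            // NaN
        System.out.println(rmseIgnoringNaN(labels, predictions)); // 0.5
    }
}
```

If this is what happens here, filtering NaN rows out of the predictions before evaluation would give a finite metric. Later Spark releases (2.2 and up, as far as I know) expose a `coldStartStrategy` parameter on ALS whose `"drop"` setting does this filtering; whether it is available depends on the Spark version in use.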

Issue Links

duplicates: SPARK-14489 RegressionEvaluator returns NaN for ALS in Spark ml (Resolved)
links to: [Github] Pull Request #12577 (MLnick)

People

Assignee: Unassigned
Reporter: Dulaj Rajitha
Votes: 0
Watchers: 4

Dates

Created: 25/Mar/16 06:42
Updated: 21/Apr/16 14:47
Resolved: 08/Apr/16 11:55