[SYSTEMDS-1238] Python test failing for LinearRegCG - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: SystemML 0.13
Fix Version/s: SystemML 0.13
Component/s: Algorithms, APIs
Labels: None

Description

Deron discovered that one of the Python tests (test_mllearn_df.py) was failing with Spark 2.1.0 because the test score from linear regression was very low (~0.24). I did some investigation, and it turns out that the model parameters computed by the DML script are incorrect. In SystemML 0.12, the betas from the linear regression model are [152.919, 938.237]. This is what we expect from the normal equation (I also verified this with sklearn). But in SystemML 0.13 (with Spark 2.1.0), the betas come out as [153.146, 458.489]. These are not correct, and therefore the test score is much lower than expected. The data going into the DML script is correct: I printed out the values of X and Y in DML and didn't see any issue there.
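For readers who want to see the kind of check described above, here is a minimal sketch of solving the normal equation for one feature plus an intercept and scoring the result with R^2. It is not the test code from this issue: the data is synthetic, and the helper names `normal_equation_1d` and `r2_score` are hypothetical, chosen only for illustration. It shows how a slope that is too small drags the score down even when the intercept is right.

```python
def normal_equation_1d(x, y):
    """Least-squares fit of y ~ b0 + b1 * x via the closed-form normal-equation solution."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept, slope]


def r2_score(x, y, betas):
    """Coefficient of determination for the predictions b0 + b1 * x."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (betas[0] + betas[1] * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free data (an assumption for illustration, not the test dataset).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi for xi in x]

betas = normal_equation_1d(x, y)
good_score = r2_score(x, y, betas)

# Same intercept, but a slope that is half of the fitted value.
bad_score = r2_score(x, y, [betas[0], betas[1] / 2.0])
```

Comparing `good_score` with `bad_score` gives a quick sanity check that a low score can come from wrong coefficients rather than from bad input data.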

Attached are the log files for the two tests (SystemML 0.12 and 0.13), both run with the explain flag.

Attachments

- python_LinearReg_test_spark.1.6.log (505 kB, 08/Feb/17 02:53, Imran Younus)
- python_LinearReg_test_spark.2.1.log (526 kB, 08/Feb/17 02:53, Imran Younus)

People

Assignee: Niketan Pansare

Reporter: Imran Younus

Votes: 0

Watchers: 3

Dates

Created: 08/Feb/17 02:53

Updated: 18/Feb/17 00:06

Resolved: 17/Feb/17 23:03