  1. SystemDS
  2. SYSTEMDS-1238

Python test failing for LinearRegCG

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • SystemML 0.13
    • SystemML 0.13
    • Algorithms, APIs
    • None

    Description

      deron discovered that one of the Python tests (test_mllearn_df.py) was failing with Spark 2.1.0 because the test score from linear regression was very low (~0.24). I did some investigation, and it turns out that the model parameters computed by the DML script are incorrect. In SystemML 0.12, the betas from the linear regression model are [152.919, 938.237]. This is what we expect from the normal equation (I also tested this with sklearn). But the betas from SystemML 0.13 (with Spark 2.1.0) come out to be [153.146, 458.489]. These are not correct, and therefore the test score is much lower than expected. The data going into the DML script is correct: I printed out the values of X and Y in DML and didn't see any issue there.
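
      For reference, the kind of normal-equation check mentioned above can be sketched roughly as follows. This is only an illustration on synthetic data, not the actual test or the DML script: the helper name normal_equation_betas, the NumPy-only approach, and the [intercept, slope] ordering of the result are all assumptions made for this sketch.

```python
import numpy as np


def normal_equation_betas(X, y):
    """Solve ordinary least squares via the normal equation.

    Hypothetical helper for illustration only. Returns the coefficients
    as [intercept, slope_1, ..., slope_k].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    if X.ndim == 1:
        # Treat a 1-D input as a single feature column.
        X = X.reshape(-1, 1)
    # Prepend a column of ones so the first coefficient is the intercept.
    A = np.column_stack([np.ones(X.shape[0]), X])
    # Normal equation: (A^T A) beta = A^T y
    return np.linalg.solve(A.T @ A, A.T @ y)


# Synthetic, noise-free data with a known intercept (3.0) and slope (2.0).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 3.0 + 2.0 * x

betas = normal_equation_betas(x, y)
print(betas)
```

      Running the same kind of check on the X and Y printed from DML gives reference betas to compare against what each SystemML version reports.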

      Attached are the log files for the two tests (SystemML 0.12 and 0.13), both run with the explain flag.

      Attachments

        1. python_LinearReg_test_spark.1.6.log
          505 kB
          Imran Younus
        2. python_LinearReg_test_spark.2.1.log
          526 kB
          Imran Younus

          People

            niketanpansare Niketan Pansare
            iyounus Imran Younus
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: