Spark / SPARK-9005

RegressionMetrics computing incorrect explainedVariance

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: MLlib
    • Labels: None

    Description

      RegressionMetrics currently computes explainedVariance from summary.variance(1), the variance of the residuals, whereas the Wikipedia definition uses the residual sum of squares, math.pow(summary.normL2(1), 2). The two agree (up to normalization) only when the residuals have zero mean, that is, when the predictor is unbiased (e.g. a linear model that includes an intercept term), and this is not always the case. We should change the computation to be consistent with the Wikipedia definition.
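
      To illustrate the discrepancy, here is a minimal sketch in plain Python (not the Spark API). The labels and predictions are made-up values for a hypothetical predictor that is biased by a constant offset, and the variance is assumed to follow the sample (n - 1) convention. The variable names are illustrative only; the comments note which expression from the description each one is meant to mirror.

```python
# Hypothetical data: every prediction is exactly 1.0 too high (a biased predictor).
labels = [1.0, 2.0, 3.0, 4.0]
predictions = [2.0, 3.0, 4.0, 5.0]
residuals = [y - p for y, p in zip(labels, predictions)]

n = len(residuals)
mean_residual = sum(residuals) / n

# Sample variance of the residuals (assumed n - 1 denominator),
# standing in for summary.variance(1).
variance_of_residuals = sum((r - mean_residual) ** 2 for r in residuals) / (n - 1)

# Residual sum of squares, standing in for math.pow(summary.normL2(1), 2).
residual_sum_of_squares = sum(r ** 2 for r in residuals)

print(variance_of_residuals)    # 0.0
print(residual_sum_of_squares)  # 4.0
```

      Because the residuals are all equal, their variance is zero even though every prediction is wrong, while the residual sum of squares is not. With zero-mean residuals the two quantities would differ only by the normalizing factor.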

          People

            Assignee: Feynman Liang (fliang)
            Reporter: Feynman Liang (fliang)