  1. Spark
  2. SPARK-9647 MLlib + SparkR integration for 1.6
  3. SPARK-9836

Provide R-like summary statistics for ordinary least squares via normal equation solver


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: ML
    • Labels: None

    Description

      In R, model fitting comes with summary statistics. We can provide most of those via the normal equation solver (SPARK-9834). If some statistics require additional passes over the dataset, we can expose an option that lets users select the desired statistics before model fitting. For reference, this is what R prints for a Gaussian model:

      > summary(model)
      
      Call:
      glm(formula = Sepal.Length ~ Sepal.Width + Species, data = iris)
      
      Deviance Residuals: 
           Min        1Q    Median        3Q       Max  
      -1.30711  -0.25713  -0.05325   0.19542   1.41253  
      
      Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
      (Intercept)         2.2514     0.3698   6.089 9.57e-09 ***
      Sepal.Width         0.8036     0.1063   7.557 4.19e-12 ***
      Speciesversicolor   1.4587     0.1121  13.012  < 2e-16 ***
      Speciesvirginica    1.9468     0.1000  19.465  < 2e-16 ***
      ---
      Signif. codes:  
      0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
      
      (Dispersion parameter for gaussian family taken to be 0.1918059)
      
          Null deviance: 102.168  on 149  degrees of freedom
      Residual deviance:  28.004  on 146  degrees of freedom
      AIC: 183.94
      
      Number of Fisher Scoring iterations: 2
      
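To make the link between the normal equations and the statistics in that summary concrete, here is a minimal sketch in plain Python. It is not Spark's implementation: the names `invert` and `ols_summary` are hypothetical, the data is assumed to fit in memory, and p-values are left out because they need a Student t CDF that the standard library does not provide. The sketch solves (XᵀX)β = Xᵀy, then derives the residual deviance, the dispersion (residual deviance divided by residual degrees of freedom, which is consistent with 28.004 / 146 ≈ 0.1918 in the output above), the standard errors, and the t values.

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment A with the identity matrix: [A | I].
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; no unique least-squares solution")
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in m]


def ols_summary(x, y):
    """Fit y ~ x via the normal equations and return R-style summary statistics.

    x is a list of feature rows WITHOUT an intercept column; one is added here.
    """
    rows = [[1.0] + list(r) for r in x]
    n, p = len(rows), len(rows[0])
    # A single pass over the data is enough to accumulate X^T X and X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    # In this sketch the residuals are computed with a second pass over the data.
    residuals = [t - sum(b * v for b, v in zip(beta, r)) for r, t in zip(rows, y)]
    residual_deviance = sum(e * e for e in residuals)
    df_residual = n - p
    dispersion = residual_deviance / df_residual
    # Var(beta_hat) = dispersion * (X^T X)^-1; standard errors come from its diagonal.
    std_err = [math.sqrt(dispersion * xtx_inv[i][i]) for i in range(p)]
    t_values = [b / s for b, s in zip(beta, std_err)]
    y_mean = sum(y) / n
    return {
        "coefficients": beta,
        "std_err": std_err,
        "t_values": t_values,
        "residuals": residuals,
        "dispersion": dispersion,
        "null_deviance": sum((t - y_mean) ** 2 for t in y),
        "df_null": n - 1,
        "residual_deviance": residual_deviance,
        "df_residual": df_residual,
    }


# Hypothetical toy data, used only to exercise the sketch.
summary = ols_summary([[1.0], [2.0], [3.0], [4.0], [5.0]],
                      [2.1, 3.9, 6.2, 7.8, 10.1])
```

The coefficient estimates, standard errors, and t values need only XᵀX, Xᵀy, and the residual sum of squares. The residual quantiles in the "Deviance Residuals" row, by contrast, require the individual residuals, which is the kind of statistic that costs an extra pass over the data.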


            People

              Assignee: Yanbo Liang (yanboliang)
              Reporter: Xiangrui Meng (mengxr)
              Shepherd: Xiangrui Meng
              Votes: 2
              Watchers: 8
