SPARK-5273 (project: Spark)

Improve documentation examples for LinearRegression

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.1, 2.0.0
    • Component/s: Documentation
    • Labels:
      None

      Description

      In the document:
      https://spark.apache.org/docs/1.1.1/mllib-linear-methods.html

      Under
      Linear least squares, Lasso, and ridge regression

      The suggested call to LinearRegressionWithSGD.train()
      // Building the model
      val numIterations = 100
      val model = LinearRegressionWithSGD.train(parsedData, numIterations)

      is not ideal, even for simple examples such as y = x. The example should instead use more realistic parameters, including an explicit step size:

      // Configure the optimizer explicitly, then train on the same data
      val lr = new LinearRegressionWithSGD()
      lr.optimizer.setStepSize(0.00000001)
      lr.optimizer.setNumIterations(100)
      val model = lr.run(parsedData)

      or

      val model = LinearRegressionWithSGD.train(parsedData, 100, 0.00000001)

      Either form produces a reasonable MSE. It took me a while on the dev forum to learn that the step size needs to be very small, so documenting this might save others the same effort when learning MLlib.
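
      One possible reason such a small step size is needed can be sketched in plain Scala, without Spark. This is a minimal illustration, not MLlib's optimizer: it runs ordinary batch gradient descent on y = x. It assumes unscaled feature values from 1 to 10000, since the report does not say what data was used. The object name `StepSizeSketch` and the helpers `fit` and `mse` are hypothetical names introduced only for this sketch.

```scala
object StepSizeSketch {
  // Hypothetical helper (not an MLlib API): fits y ~ w * x by batch gradient
  // descent on the mean squared error and returns the final weight.
  def fit(xs: Array[Double], ys: Array[Double],
          numIterations: Int, stepSize: Double): Double = {
    var w = 0.0
    for (_ <- 1 to numIterations) {
      // Gradient of (1 / 2n) * sum((w * x - y)^2) with respect to w.
      val grad = xs.zip(ys).map { case (x, y) => (w * x - y) * x }.sum / xs.length
      w -= stepSize * grad
    }
    w
  }

  // Mean squared error of the predictions w * x against y.
  def mse(xs: Array[Double], ys: Array[Double], w: Double): Double =
    xs.zip(ys).map { case (x, y) => math.pow(w * x - y, 2) }.sum / xs.length

  def main(args: Array[String]): Unit = {
    // Assumed data: y = x with unscaled features 1.0 to 10000.0.
    val xs = (1 to 10000).map(_.toDouble).toArray
    val ys = xs.clone()
    // Compare a large step size with the very small one from the report.
    for (stepSize <- Seq(1.0, 0.00000001)) {
      val w = fit(xs, ys, 100, stepSize)
      println(s"stepSize=$stepSize weight=$w mse=${mse(xs, ys, w)}")
    }
  }
}
```

      Under these assumptions each update multiplies the error in the weight by (1 - stepSize * mean(x^2)). With mean(x^2) around 3.3e7, a step size of 1.0 makes the weight blow up to a non-finite value, while 0.00000001 shrinks the error by roughly a third per iteration. Scaling the features first would be another way to make larger step sizes usable.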

            People

            • Assignee: Sean R. Owen (srowen)
            • Reporter: Dev Lakhani (devl.development)
            • Votes: 0
            • Watchers: 3
