Spark / SPARK-5273

Improve documentation examples for LinearRegression


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.1, 2.0.0
    • Component/s: Documentation
    • Labels:
      None

      Description

      In the document:
      https://spark.apache.org/docs/1.1.1/mllib-linear-methods.html

      Under
      Linear least squares, Lasso, and ridge regression

      The suggested way to use LinearRegressionWithSGD.train()
      // Building the model
      val numIterations = 100
      val model = LinearRegressionWithSGD.train(parsedData, numIterations)

      is not ideal, even for simple examples such as y = x. It should be replaced with more realistic parameters that set the step size explicitly:

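      // Configure the optimizer explicitly, then train on the same parsedData as above
      val lr = new LinearRegressionWithSGD()
      lr.optimizer.setStepSize(0.00000001)
      lr.optimizer.setNumIterations(100)
      val model = lr.run(parsedData)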

      or

      LinearRegressionWithSGD.train(input, 100, 0.00000001)

      Either form produces a reasonable MSE. It took me a while on the dev forum to learn that the step size should be really small. Documenting this might save someone the same effort when learning MLlib.
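
To illustrate why the step size matters even for y = x, here is a minimal plain-Scala sketch that does not use Spark. It assumes full-batch gradient descent on a one-feature model y = w * x with inputs 1 to 100 and loss (1/2n) * sum((w*x - y)^2). The names StepSizeDemo and fit are hypothetical, and this is not MLlib's optimizer (which, among other differences, samples mini-batches and decays the step size), so treat it only as an intuition aid.

```scala
object StepSizeDemo {
  /** Fits y = w * x by full-batch gradient descent, starting from w = 0.
    * Illustrative helper only; not part of Spark or MLlib. */
  def fit(xs: Seq[Double], ys: Seq[Double], numIterations: Int, stepSize: Double): Double = {
    val n = xs.length.toDouble
    var w = 0.0
    for (_ <- 1 to numIterations) {
      // Gradient of (1/2n) * sum((w*x - y)^2) with respect to w
      val gradient = xs.zip(ys).map { case (x, y) => (w * x - y) * x }.sum / n
      w -= stepSize * gradient
    }
    w
  }

  def main(args: Array[String]): Unit = {
    val xs = (1 to 100).map(_.toDouble)
    val ys = xs // y = x, so the ideal weight is 1.0
    // A step size of 1.0 overshoots on every update and the weight blows up
    println(s"stepSize = 1.0  -> w = ${fit(xs, ys, 100, 1.0)}")
    // A much smaller step size lets the weight settle close to 1.0
    println(s"stepSize = 1e-4 -> w = ${fit(xs, ys, 100, 1e-4)}")
  }
}
```

In this simplified setting the update is stable only when the step size is below 2 / mean(x^2), which is about 5.9e-4 for inputs 1 to 100. The bound shrinks as feature magnitudes grow, which is consistent with needing a very small step size on unscaled data.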

