Spark / SPARK-7780

The intercept in LogisticRegressionWithLBFGS should not be regularized


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: MLlib
    • Labels:
      None

      Description

      The intercept in logistic regression represents a prior on the categories and should not be regularized. In MLlib, regularization is handled through `Updater`, and the `Updater` penalizes all components without excluding the intercept, which results in poor training accuracy when regularization is used.
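
To make the difference concrete, here is a minimal sketch in plain Python, not the MLlib `Updater` code. It assumes an L2 penalty of the form 0.5 * lam * ||w||^2 on the mean logistic loss. The function `loss_and_gradient` and the flag `regularize_intercept` are hypothetical names introduced only for this illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_gradient(weights, intercept, rows, labels, lam,
                      regularize_intercept=False):
    """Mean logistic loss plus an L2 penalty, with its gradient.

    Hypothetical helper: `regularize_intercept=True` mimics a penalty that
    covers every component; `False` leaves the intercept unpenalized.
    """
    n = len(rows)
    loss = 0.0
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in zip(rows, labels):
        margin = intercept + sum(w * xi for w, xi in zip(weights, x))
        p = sigmoid(margin)
        loss += -(y * math.log(p) + (1 - y) * math.log(1.0 - p)) / n
        for j, xi in enumerate(x):
            grad_w[j] += (p - y) * xi / n
        grad_b += (p - y) / n

    # The feature weights are always penalized.
    loss += 0.5 * lam * sum(w * w for w in weights)
    grad_w = [g + lam * w for g, w in zip(grad_w, weights)]

    # The intercept is penalized only when explicitly requested.
    if regularize_intercept:
        loss += 0.5 * lam * intercept * intercept
        grad_b += lam * intercept
    return loss, grad_w, grad_b
```

With the flag set, the intercept gradient gains an extra `lam * intercept` term that pulls the intercept toward zero regardless of the class balance in the data; the weight gradients are the same in both cases.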

      The new implementation in the ML framework handles this properly, and we should call the ML implementation from MLlib, since the majority of users are still using the MLlib API.

      Note that both implementations apply feature scaling to improve convergence; the only difference is that the ML version does not regularize the intercept. As a result, when lambda is zero, they converge to the same solution.
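
The claim about lambda can be checked on toy data. The sketch below uses plain gradient descent rather than L-BFGS and omits feature scaling, so it only illustrates the argument and is not the MLlib or ML code path. The function `train`, its parameters, and the data are assumptions made up for this example.

```python
import math


def train(rows, labels, lam, regularize_intercept, steps=2000, lr=0.5):
    """Fit L2-regularized logistic regression by batch gradient descent.

    Hypothetical helper: the flag decides whether the penalty also covers
    the intercept `b`.
    """
    w = [0.0] * len(rows[0])
    b = 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            margin = b + sum(wj * xj for wj, xj in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-margin))
            for j, xj in enumerate(x):
                grad_w[j] += (p - y) * xj / n
            grad_b += (p - y) / n
        # Penalize the weights in both variants.
        grad_w = [g + lam * wj for g, wj in zip(grad_w, w)]
        # Penalize the intercept only in the "regularize everything" variant.
        if regularize_intercept:
            grad_b += lam * b
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Imbalanced, non-separable toy data: five positives, one negative.
rows = [[0.0], [0.5], [1.0], [1.5], [2.0], [2.5]]
labels = [1, 0, 1, 1, 1, 1]

# lambda = 0: the penalty vanishes, so both variants take identical steps.
same_a = train(rows, labels, 0.0, regularize_intercept=True)
same_b = train(rows, labels, 0.0, regularize_intercept=False)

# lambda > 0: penalizing the intercept shrinks it toward zero.
_, b_penalized = train(rows, labels, 1.0, regularize_intercept=True)
_, b_free = train(rows, labels, 1.0, regularize_intercept=False)
```

On this data, `same_a` and `same_b` are equal, while `b_penalized` ends up smaller than `b_free`: the penalized intercept can no longer track the class imbalance, which is the accuracy problem described above.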


              People

              • Assignee: Holden Karau (holden)
              • Reporter: DB Tsai (dbtsai)
