  1. Spark
  2. SPARK-2085

Apply user-specific regularization instead of uniform regularization in Alternating Least Squares (ALS)

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Implemented
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.1.0
    • Component/s: MLlib
    • Labels: None

    Description

      The current implementation of ALS takes a single regularization parameter and applies it to both the user factors and the product factors. This kind of regularization can be less effective when the number of users is significantly larger than the number of products (or vice versa). For example, if we have 10M users and 1K products, regularization on user factors will dominate. Following the discussion in [this thread](http://apache-spark-user-list.1001560.n3.nabble.com/possible-bug-in-Spark-s-ALS-implementation-tt2567.html#a2704), the implementation in this PR will regularize each factor vector by #ratings * lambda, where #ratings is the number of ratings associated with that user or product.

      Link to PR: https://github.com/apache/spark/pull/1026
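
To make the scaling in the description concrete, here is a minimal sketch of a single ALS least-squares update in plain Python. It is not the code from the PR (Spark's ALS is written in Scala); the function names `solve` and `update_factor` and the `weighted` flag are hypothetical and exist only for illustration. The only difference between the two modes is whether lambda is multiplied by the number of ratings before it is added to the diagonal of the normal equations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def update_factor(other_factors, ratings, lam, weighted=True):
    """One ALS update for a single user (or, symmetrically, a single product).

    other_factors: factor vectors of the products this user rated
    ratings:       the corresponding ratings, in the same order
    lam:           the regularization parameter lambda
    weighted:      if True, regularize by #ratings * lambda (illustrative flag);
                   if False, use the same lambda regardless of #ratings
    """
    rank = len(other_factors[0])
    n_ratings = len(ratings)
    reg = lam * n_ratings if weighted else lam
    # Normal equations: (Y^T Y + reg * I) x = Y^T r
    a = [[sum(y[i] * y[j] for y in other_factors) + (reg if i == j else 0.0)
          for j in range(rank)]
         for i in range(rank)]
    b = [sum(y[i] * r for y, r in zip(other_factors, ratings))
         for i in range(rank)]
    return solve(a, b)


# A user with two ratings: the penalty is 2 * lambda when weighted, lambda otherwise.
product_factors = [[1.0, 0.0], [0.0, 1.0]]
user_ratings = [3.0, 4.0]
scaled = update_factor(product_factors, user_ratings, lam=0.5, weighted=True)
uniform = update_factor(product_factors, user_ratings, lam=0.5, weighted=False)
```

In this sketch a user with many ratings receives a proportionally larger penalty, so the strength of regularization relative to the data term stays roughly constant across users and products with very different rating counts.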

          People

            Assignee: Unassigned
            Reporter: Shuo Xiang (coderxiang)
            Votes: 0
            Watchers: 3