Mahout / MAHOUT-878

Provide better examples for the parallel ALS recommender code

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0
    • Fix Version/s: 0.6
    • Labels:
      None

      Description

      We should provide examples that show how to apply the parallel ALS recommender to the Netflix or KDD2011 datasets.

      1. MAHOUT-878.patch
        13 kB
        Sebastian Schelter

        Activity

        Grant Ingersoll added a comment -

        See also the stuff I did for build-asf-email.sh. Would be nice to add into that.

        Sebastian Schelter added a comment -

        Attached a shell script to run parallel ALS on the Netflix dataset.

        Hudson added a comment -

        Integrated in Mahout-Quality #1164 (See https://builds.apache.org/job/Mahout-Quality/1164/)
        MAHOUT-878 Provide better examples for the parallel ALS recommender code

        ssc : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1200366
        Files :

        • /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/FactorizationEvaluator.java
        • /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/ParallelALSFactorizationJob.java
        • /mahout/trunk/examples/bin/factorize-netflix.sh
        • /mahout/trunk/examples/src/main/java/org/apache/mahout/cf/taste/hadoop
        • /mahout/trunk/examples/src/main/java/org/apache/mahout/cf/taste/hadoop/example
        • /mahout/trunk/examples/src/main/java/org/apache/mahout/cf/taste/hadoop/example/als
        • /mahout/trunk/examples/src/main/java/org/apache/mahout/cf/taste/hadoop/example/als/netflix
        • /mahout/trunk/examples/src/main/java/org/apache/mahout/cf/taste/hadoop/example/als/netflix/NetflixDatasetConverter.java
        • /mahout/trunk/math/src/main/java/org/apache/mahout/math/als/AlternatingLeastSquaresSolver.java
        Grant Ingersoll added a comment -

        You might also do one for the Amazon product review data set at http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html, which has 5.8M reviews. I've got some sequential preprocessing code that extracts the items, converts IDs to longs, and gets the rating.
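
        As a rough illustration of the kind of preprocessing described in this comment (this is not the code Grant refers to), the sketch below maps string user and item identifiers to integer IDs and writes one rating per line. The function names `to_long_ids` and `write_preferences`, the `(user, item, rating)` input tuples, and the comma-separated `userID,itemID,rating` output layout are all assumptions made for this example.

```python
import csv
import io


def to_long_ids(reviews):
    """Yield (user_id, item_id, rating) with string ids replaced by integers.

    `reviews` is an iterable of (user, item, rating) tuples; this input
    shape is hypothetical and chosen only for the example.
    """
    user_ids, item_ids = {}, {}
    for user, item, rating in reviews:
        # Assign the next unused integer the first time an id is seen.
        u = user_ids.setdefault(user, len(user_ids))
        i = item_ids.setdefault(item, len(item_ids))
        yield u, i, float(rating)


def write_preferences(triples, out):
    """Write one 'userID,itemID,rating' line per triple to a text stream."""
    writer = csv.writer(out, lineterminator="\n")
    for u, i, r in triples:
        writer.writerow([u, i, r])


if __name__ == "__main__":
    sample = [("alice", "B001", "5.0"), ("bob", "B001", "3.0"), ("alice", "B002", "4.0")]
    buf = io.StringIO()
    write_preferences(to_long_ids(sample), buf)
    print(buf.getvalue(), end="")
```

        The in-memory dictionaries keep the sketch short; they only suit a sequential pass over data that fits on one machine, which matches the "sequential preprocessing" mentioned above.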

        Sebastian Schelter added a comment -

        Would make a nice use case, but 5.8 million is a bit small for a Hadoop-based solution.

        Grant Ingersoll added a comment -

        Sure, but most of our examples are meant to be tried out locally, too.

        Sebastian Schelter added a comment -

        OK, but we already have a small example for this algorithm that uses the 1 million rating MovieLens dataset.


          People

          • Assignee:
            Sebastian Schelter
            Reporter:
            Sebastian Schelter
          • Votes:
            0
            Watchers:
            1
