Mahout / MAHOUT-1288

Create recommendation as search demo

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9
    • Component/s: None
    • Labels: None

      Description

      The basic idea is that a recommendation engine can be deployed by doing off-line analysis to find anomalously cooccurring items and then indexing these items in a search engine. These anomalous items are considered indicators which are indexed using an ordinary text search engine. Recommendations are generated by querying the search engine using recent behavior as a query. Recommendations can be combined with geo filtering and text queries.
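
      The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation. It assumes a log-likelihood ratio (LLR) test as the measure of anomalous cooccurrence (the description does not name the test), and it replaces the search engine with a simple count of matching indicators. The function names and the threshold value are hypothetical.

```python
from collections import defaultdict
from itertools import permutations
from math import log


def x_log_x(x):
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    # Unnormalized entropy term used by the LLR (G^2) statistic.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 cooccurrence contingency table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


def build_indicators(histories, threshold=1.0):
    """Off-line step: map each item to the items that cooccur with it anomalously.

    histories maps a user id to the list of items that user interacted with.
    The threshold is an assumed cutoff, chosen only for illustration.
    """
    n_users = len(histories)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in histories.values():
        unique = set(items)
        for item in unique:
            item_count[item] += 1
        for a, b in permutations(unique, 2):
            pair_count[(a, b)] += 1

    indicators = defaultdict(set)
    for (a, b), k11 in pair_count.items():
        k12 = item_count[a] - k11  # users with a but not b
        k21 = item_count[b] - k11  # users with b but not a
        k22 = n_users - k11 - k12 - k21  # users with neither
        # Keep only pairs that cooccur more often than independence predicts.
        more_than_expected = k11 * n_users > item_count[a] * item_count[b]
        if more_than_expected and llr(k11, k12, k21, k22) > threshold:
            indicators[a].add(b)
    return dict(indicators)


def recommend(indicators, recent, top_n=3):
    """Query step: rank items by how many of their indicators match recent behavior.

    A real deployment would index each item's indicators as a text field and
    let the search engine do this scoring.
    """
    recent = set(recent)
    scores = {
        item: len(inds & recent)
        for item, inds in indicators.items()
        if item not in recent
    }
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked if score > 0][:top_n]
```

      Here each entry of the indicators mapping plays the role of one indexed document, and the user's recent items play the role of the query terms.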

      See http://bit.ly/18vbbaT for a living design document.

        Activity

        Iker Huerga added a comment -

        Hi,

        I'd be very interested in collaborating on the development of this new feature. Are you guys planning to use JIRA to organize the user stories and tasks? It would be worth scheduling a conference call, maybe a hangout, to kick this off.

        Thanks
        Iker

        Pat Ferrel added a comment -

        I have the workflow running that ingests the data, creates the similarity matrix, and writes it to CSV. This creates the docs that will be indexed by Solr as described in the design doc.

        This also includes the oft-talked-about cross-action recommender in the form of a modified RecommenderJob. It is based on the Mahout item-based recommender implemented on Hadoop and adds two new jobs, XRecommenderJob and PreparePreferenceMatricesJob. These ultimately create all recs for all users and can be used separately from Solr.

        Both single-action and cross-action data ([B'B], [B'A], B, and A) are written to CSV files for indexing by Solr, which can then return recs in response to a query.
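
        As a rough illustration of the matrices named above, the sketch below computes [B'B] and [B'A] from two small user-by-item matrices and writes one CSV row per item. It assumes B and A hold the same users in the same row order, one matrix per action type. The helper names and the three-column row layout are hypothetical and are not the format used by the solr-recommender project. The similarity filtering that the real jobs apply is omitted here, so any nonzero cooccurrence counts as an indicator.

```python
import csv
import io


def transpose_times(left, right):
    """Return left' * right for two user-by-item matrices over the same users."""
    n_left = len(left[0])
    n_right = len(right[0])
    out = [[0] * n_right for _ in range(n_left)]
    for row_l, row_r in zip(left, right):
        for i, l_val in enumerate(row_l):
            if l_val:
                for j, r_val in enumerate(row_r):
                    out[i][j] += l_val * r_val
    return out


def indicator_rows(item_ids_b, item_ids_a, btb, bta):
    """Yield [item id, B'B indicator ids, B'A indicator ids] for each item of B."""
    for i, item in enumerate(item_ids_b):
        # Skip the diagonal: an item trivially cooccurs with itself.
        self_ind = [item_ids_b[j] for j, v in enumerate(btb[i]) if v and j != i]
        cross_ind = [item_ids_a[j] for j, v in enumerate(bta[i]) if v]
        yield [item, " ".join(self_ind), " ".join(cross_ind)]


def to_csv(rows):
    """Serialize rows to CSV text; indicator ids stay space-separated in one field."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Rows are users; columns are items. Values are 1 if the user took the action.
    B = [[1, 1], [1, 0], [0, 1]]
    A = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
    btb = transpose_times(B, B)
    bta = transpose_times(B, A)
    print(to_csv(indicator_rows(["b1", "b2"], ["a1", "a2", "a3"], btb, bta)))
```

        Keeping the indicator ids space-separated inside a single field means a search engine can tokenize that field like ordinary text.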

        All phases except the initial ingest run on Hadoop.

        https://github.com/pferrel/solr-recommender

        Proceeding to integrate with Solr and create the query API.

        Ted Dunning added a comment -

        Awesome work, Pat.

        Pat Ferrel added a comment -

        Updated to work with Mahout 0.9. Removed a hacked version of Mahout's item-based RecommenderJob; it is no longer needed now that the default working directory is visible and there is an option to output a sequence file.

        This has not been tested on a cluster yet; I am upgrading mine to Hadoop 1.2.1 to test it. It passes the hand-run test in /scripts using the local fs with Hadoop 1.2.1.

        Suneel Marthi added a comment -

        Pat/Ted, can this be moved to 1.0?

        Pat Ferrel added a comment -

        Can what be moved to 1.0? The current GitHub code works with 0.9 and has been cleaned of unneeded duplicate code, thanks to changes Sebastian made to the MapReduce version of the item-based recommender used in the project.

        The code isn't in the 0.9 codebase (and probably shouldn't be), so you guys will have to decide where to put it. Maybe some examples repo? As far as I'm concerned you can reference my repo if you want. I'm using it for a fairly extensive demo site.

        If you are talking about a public code demo site, that has not been tasked. I'm doing one that will probably not be a public repo since it relies on private data.

        Suneel Marthi added a comment -

        Oops, sorry. I wasn't following this thread closely, so I wasn't aware of its history. Can this JIRA be marked as resolved, then?

        Pat Ferrel added a comment -

        Yes, subject to Ted's veto. I've done all I plan to for now.

        There is a cross-recommender in the code that could and should be put into Mahout, but that would need to be pushed to 1.0 and will be a separate JIRA.

        https://github.com/pferrel/solr-recommender

        BTW there has been one outside contribution to this.

        Suneel Marthi added a comment -

        Per conversation in this thread, marking this as resolved for 0.9.


          People

          • Assignee: Suneel Marthi
          • Reporter: Ted Dunning
          • Votes: 0
          • Watchers: 7
