MAHOUT-1663: Port seq2sparse to the Mahout spark-scala Environment

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.9
    • Fix Version/s: 0.11.1
    • Component/s: None
    • Labels:

      Description

      Implement a Scala version of seq2sparse in the Spark module. This effort is currently in progress.
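
For readers unfamiliar with the tool: seq2sparse turns a collection of text documents into sparse term vectors. The block below is only a rough sketch of that idea in plain Python, not the Mahout code and not the proposed Spark/Scala port. The function name `seq2sparse_sketch`, the whitespace tokenizer, and the textbook `tf * log(N / df)` weighting are all assumptions made for illustration; the actual tool's analyzer and weighting may differ.

```python
import math
from collections import Counter


def seq2sparse_sketch(docs):
    """Hypothetical helper: map {doc_id: text} to (dictionary, sparse vectors).

    Each vector is a dict {term_index: weight}; entries with zero weight
    are omitted so the representation stays sparse.
    """
    # Assumption: lowercase + whitespace split stands in for a real analyzer.
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}

    # Pass 1 over the whole corpus: assign every distinct term an integer index.
    terms = sorted({t for tokens in tokenized.values() for t in tokens})
    dictionary = {term: index for index, term in enumerate(terms)}

    # Pass 2 over the whole corpus: count the documents containing each term.
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))

    # Pass 3: weight each document's term counts (textbook TF-IDF, an assumption).
    n_docs = len(tokenized)
    vectors = {}
    for doc_id, tokens in tokenized.items():
        vector = {}
        for term, count in Counter(tokens).items():
            weight = count * math.log(n_docs / doc_freq[term])
            if weight != 0.0:
                vector[dictionary[term]] = weight
        vectors[doc_id] = vector
    return dictionary, vectors


if __name__ == "__main__":
    dictionary, vectors = seq2sparse_sketch(
        {"d1": "apple banana apple", "d2": "banana cherry"}
    )
    print(dictionary)
    print(vectors)
```

Note that in this sketch the dictionary and the document frequencies are both computed over the entire corpus before any vector is emitted, so adding a document can change the indices and weights of vectors already produced.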

        Activity

        smarthi Suneel Marthi added a comment -

        It's fine to port the existing seq2sparse to Spark for 0.10.1 so as to have a complete pipeline. In the long term we need to rethink how we wanna do this. seq2sparse was the big bottleneck in the legacy MapReduce (MR) pipeline, not to mention that there was no way to incrementally update the term vectors for new streaming documents.

        There have been discussions in the past about maybe using Finite State Automata (which have shipped with Lucene since 4.0), Word2Vec, etc. See the discussion in https://issues.apache.org/jira/browse/MAHOUT-1252

        smarthi Suneel Marthi added a comment -

        No progress on this; closing the JIRA.

        smarthi Suneel Marthi added a comment -

        Mahout 0.11.1 was released on November 11, 2015; closing the issues.


          People

          • Assignee: Gokhan Capan (gokhancapan)
          • Reporter: Andrew Palumbo (Andrew_Palumbo)
          • Votes: 0
          • Watchers: 2
