MAHOUT-1001

Performance improvement in recommenditembased

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.7
    • Labels:
      None

      Description

      While running recommendations on the ASFEMail dataset using the example script provided with Mahout, we noticed that the execution time of the unsymmetrify mapper is very long. Profiling the task revealed a hotspot consuming a large share of CPU cycles. The attached patch addresses the issue by optimizing the unsymmetrify mapper class. It retains functionality (we verified the output with and without the patch) and speeds up the unsymmetrify mapper by more than 5X on x86 architectures.
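The patch itself is not reproduced on this page; the commit message only describes it as an "optimization of vector allocation". As a rough illustration of why allocation inside a mapper loop can become a hotspot, the sketch below assumes the mapper creates one single-entry partial vector per similarity value and compares sizing each partial like the source row with sizing it for one entry. `TinySparseVector`, `allocatedSlots`, and all parameters are hypothetical names for this sketch, not Mahout classes or the contents of the patch.

```java
import java.util.Arrays;

public class PartialVectorAllocation {

    /** Hypothetical minimal sparse vector (NOT a Mahout class): parallel arrays grown on demand. */
    static final class TinySparseVector {
        private int[] indices;
        private double[] values;
        private int used;

        TinySparseVector(int initialCapacity) {
            int capacity = Math.max(1, initialCapacity);
            // Both backing arrays are allocated eagerly, so capacity has a direct cost.
            this.indices = new int[capacity];
            this.values = new double[capacity];
        }

        int capacity() {
            return indices.length;
        }

        void set(int index, double value) {
            for (int i = 0; i < used; i++) {
                if (indices[i] == index) {
                    values[i] = value;
                    return;
                }
            }
            if (used == indices.length) {
                // Double the backing arrays when full.
                indices = Arrays.copyOf(indices, indices.length * 2);
                values = Arrays.copyOf(values, values.length * 2);
            }
            indices[used] = index;
            values[used] = value;
            used++;
        }
    }

    /**
     * Creates one single-entry partial vector per entry of a row and returns
     * the total number of array slots allocated across all partials.
     */
    public static long allocatedSlots(int entriesPerRow, int capacityPerPartial) {
        long slots = 0;
        for (int i = 0; i < entriesPerRow; i++) {
            TinySparseVector partial = new TinySparseVector(capacityPerPartial);
            partial.set(i, 1.0); // each partial only ever holds one value
            slots += partial.capacity();
        }
        return slots;
    }

    public static void main(String[] args) {
        int entriesPerRow = 100; // assumed row width, for illustration only
        long sizedLikeRow = allocatedSlots(entriesPerRow, entriesPerRow);
        long sizedForOne = allocatedSlots(entriesPerRow, 1);
        // prints "sized like row: 10000 slots, sized for one entry: 100 slots"
        System.out.println("sized like row: " + sizedLikeRow
                + " slots, sized for one entry: " + sizedForOne + " slots");
    }
}
```

Under these assumptions the over-sized variant allocates a number of slots quadratic in the row width, while the right-sized variant stays linear. Whether this matches the actual change in RowSimilarityJob.java should be checked against the attached patch.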

      1. RowSimilarityJob.patch
        2 kB
        Bhaskar Devireddy

        Activity

        Bhaskar Devireddy created issue -
        Bhaskar Devireddy made changes -
        Field Original Value New Value
        Attachment RowSimilarityJob.patch [ 12524291 ]
        Bhaskar Devireddy made changes -
        Description: "high CPU cycle" corrected to "high CPU cycles"; text otherwise unchanged
        Sean Owen added a comment -

        Looks good, I've committed.

        Sean Owen made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Sean Owen made changes -
        Priority Major [ 3 ] Minor [ 4 ]
        Due Date 25/Apr/12
        Hudson added a comment -

        Integrated in Mahout-Quality #1449 (See https://builds.apache.org/job/Mahout-Quality/1449/)
        MAHOUT-1001 optimization of vector allocation (Revision 1330414)

        Result = SUCCESS
        srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1330414
        Files :

        • /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/similarity/cooccurrence/RowSimilarityJob.java
        Sean Owen made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Sean Owen
            Reporter:
            Bhaskar Devireddy
          • Votes:
            0
            Watchers:
            0
