Mahout / MAHOUT-1035

Hotspot in recommenditembased – UnsymmetrifyMapper job

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: 0.8
    • Labels:
      None

      Description

      While profiling the unsymmetrify mapper job in recommendations, we noticed a hotspot in the org.apache.mahout.math.map.OpenIntDoubleHashMap.keys method that consumed 90% of the CPU runtime of the first map task. For profiling, we used the script provided in the Mahout examples for running ASF Email recommendations. The attached patch addresses the hotspot by changing the initialization of transposedPartial, which reduces the number of for-loop iterations in the OpenIntDoubleHashMap.keys method. The patch retains functionality (we verified that the output is the same with and without the patch) and speeds up the unsymmetrify mapper task by more than 4X on x86 architectures.
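
To illustrate why the initialization of a hash-backed vector can matter for a keys() call, here is a minimal sketch of a toy open-addressing map. It is not Mahout's OpenIntDoubleHashMap; the class ToyIntDoubleMap, its fields, and the slotsScanned counter are hypothetical names introduced only for this example. It assumes, as the description suggests but does not spell out, that the loop in keys() walks the whole backing table, so its cost depends on the table capacity rather than on the number of stored entries.

```java
import java.util.Arrays;

/** Toy open-addressing int->double map. Illustration only, NOT Mahout's implementation. */
class ToyIntDoubleMap {
    private static final byte FREE = 0;
    private static final byte FULL = 1;

    private final int[] table;      // stored keys
    private final double[] values;  // stored values, parallel to table
    private final byte[] state;     // FREE or FULL per slot
    private int size;               // number of stored entries
    long slotsScanned;              // hypothetical counter: slots visited by keys()

    ToyIntDoubleMap(int capacity) {
        table = new int[capacity];
        values = new double[capacity];
        state = new byte[capacity];
    }

    void put(int key, double value) {
        // The toy map never resizes; this guard keeps the probe loop below
        // from spinning forever. A real map would grow the table instead.
        if (size == table.length) {
            throw new IllegalStateException("toy map is full");
        }
        int i = Math.floorMod(key, table.length);
        // Linear probing: advance until we find the key or a free slot.
        while (state[i] == FULL && table[i] != key) {
            i = (i + 1) % table.length;
        }
        if (state[i] == FREE) {
            state[i] = FULL;
            size++;
        }
        table[i] = key;
        values[i] = value;
    }

    /** Collects all keys by scanning every slot of the backing table. */
    int[] keys() {
        int[] out = new int[size];
        int j = 0;
        for (int i = 0; i < table.length; i++) {
            slotsScanned++;
            if (state[i] == FULL) {
                out[j++] = table[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] sampleKeys = {7, 42, 1000003};
        // Same three entries, two very different initial capacities.
        for (int capacity : new int[] {1 << 20, 16}) {
            ToyIntDoubleMap map = new ToyIntDoubleMap(capacity);
            for (int k : sampleKeys) {
                map.put(k, 1.0);
            }
            int[] keys = map.keys();
            Arrays.sort(keys);
            System.out.println("capacity=" + capacity
                    + " keys=" + Arrays.toString(keys)
                    + " slotsScanned=" + map.slotsScanned);
        }
    }
}
```

Both maps return the same three keys, but the one created with the large capacity visits every slot of its table to find them. If keys() is called once per input record, as it may be inside a mapper, that difference is multiplied by the number of records.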

        Attachments

        1. patch_1035.patch
          2 kB
          Bhaskar Devireddy
        2. patch_1035_ver2.patch
          1 kB
          Bhaskar Devireddy
        3. MAHOUT-1035.patch
          3 kB
          Sebastian Schelter

            People

            • Assignee:
              ssc Sebastian Schelter
            • Reporter:
              bhaskar.devireddy Bhaskar Devireddy
