  1. Mahout
  2. MAHOUT-1035

Hotspot in recommenditembased – UnsymmetrifyMapper job

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: 0.8
    • Component/s: None
    • Labels: None

    Description

      While profiling the unsymmetrify mapper job in recommenditembased, we noticed a hotspot in the org.apache.mahout.math.map.OpenIntDoubleHashMap.keys method that consumed 90% of the CPU runtime of the first map task. For profiling, we used the script provided in the Mahout examples for running the ASF Email recommendations. The attached patch addresses the hotspot by changing the initialization of transposedPartial, which reduces the number of for-loop iterations in OpenIntDoubleHashMap.keys. The patch retains functionality (we verified the output with and without the patch) and speeds up the unsymmetrify mapper task by more than 4x on x86 architectures.
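
      The exact change is in the attached patches. As a rough illustration of why the initialization can matter, the sketch below uses a hypothetical toy map (TinyIntDoubleMap, not Mahout's OpenIntDoubleHashMap) and assumes, as the description suggests, that collecting the keys loops over every slot of the backing table. Under that assumption the cost of keys() follows the table capacity rather than the number of stored entries, so a map created with a much larger capacity than it needs pays for the empty slots on every call.

```java
// Illustrative sketch only: a toy open-addressing int -> double map.
// This is NOT Mahout's OpenIntDoubleHashMap; all names here are hypothetical.
final class KeysScanSketch {

    static final class TinyIntDoubleMap {
        private final int[] table;      // stored keys
        private final double[] values;  // stored values
        private final boolean[] used;   // slot occupancy flags

        TinyIntDoubleMap(int capacity) {
            table = new int[capacity];
            values = new double[capacity];
            used = new boolean[capacity];
        }

        // Linear probing insert; assumes the table never becomes full.
        void put(int key, double value) {
            int i = Math.floorMod(key, table.length);
            while (used[i] && table[i] != key) {
                i = (i + 1) % table.length;
            }
            used[i] = true;
            table[i] = key;
            values[i] = value;
        }

        // Copies all keys into 'out' and returns how many slots were visited.
        // The loop runs over the whole table, including empty slots.
        int keys(int[] out) {
            int visited = 0;
            int n = 0;
            for (int i = 0; i < table.length; i++) {
                visited++;
                if (used[i]) {
                    out[n++] = table[i];
                }
            }
            return visited;
        }
    }

    // Builds a map with the given capacity, stores 'entries' distinct keys,
    // and reports how many slots a single keys() call has to visit.
    static int slotsVisited(int capacity, int entries) {
        TinyIntDoubleMap map = new TinyIntDoubleMap(capacity);
        for (int k = 0; k < entries; k++) {
            map.put(k * 7, 1.0);
        }
        return map.keys(new int[entries]);
    }

    public static void main(String[] args) {
        // Same three entries, very different amounts of work in keys().
        System.out.println("oversized table: " + slotsVisited(1 << 16, 3) + " slots visited");
        System.out.println("right-sized table: " + slotsVisited(8, 3) + " slots visited");
    }
}
```

      With three entries in both cases, the oversized table makes keys() visit every one of its slots, while the small table does the same job in a handful of iterations. Whether the real patch shrinks the capacity in exactly this way should be checked against the attached diff.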

      Attachments

        1. patch_1035.patch
          2 kB
          Bhaskar Devireddy
        2. patch_1035_ver2.patch
          1 kB
          Bhaskar Devireddy
        3. MAHOUT-1035.patch
          3 kB
          Sebastian Schelter

        Activity

          People

            Assignee: Sebastian Schelter (ssc)
            Reporter: Bhaskar Devireddy (bhaskar.devireddy)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:
              Resolved: