Spark / SPARK-11968

ALS recommend-all methods spend most of their time in GC

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.2, 1.6.0
    • Fix Version/s: 2.2.0
    • Component/s: ML, MLlib
    • Labels: None

    Description

      After adding recommendUsersForProducts and recommendProductsForUsers to ALS in spark-perf, I noticed that they take much longer than ALS itself. The monitoring page shows each 10 min task spending about 8 min in GC, which looks fixable. The implementation has a clear opportunity to avoid extra allocations: https://github.com/apache/spark/blob/e6dd237463d2de8c506f0735dfdb3f43e8122513/mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala#L283

      CC: mengxr
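To illustrate the kind of allocation the description has in mind, here is a minimal sketch of a top-k selection that avoids creating one object per (user, product) score. It is not taken from Spark and is not necessarily how this issue was resolved. The class `TopKSketch` and the methods `dot`, `topK`, and `siftDown` are hypothetical names, and the factor vectors are assumed to be plain `double[]` arrays. Scores go into a size-k min-heap backed by two primitive arrays, so each user costs a fixed number of allocations no matter how many products are scored.

```java
// Hypothetical illustration only; none of these names exist in Spark.
class TopKSketch {

    /** Dot product over primitive arrays; allocates nothing. */
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Restores the min-heap property below position i for a heap of the given size. */
    static void siftDown(double[] score, int[] idx, int i, int size) {
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (left < size && score[left] < score[smallest]) smallest = left;
            if (right < size && score[right] < score[smallest]) smallest = right;
            if (smallest == i) return;
            double ts = score[i]; score[i] = score[smallest]; score[smallest] = ts;
            int ti = idx[i]; idx[i] = idx[smallest]; idx[smallest] = ti;
            i = smallest;
        }
    }

    /**
     * Returns the indices of the k highest-scoring products for one user,
     * best first. Only three arrays are allocated per call.
     */
    static int[] topK(double[] user, double[][] products, int k) {
        int n = Math.min(k, products.length);
        double[] heapScore = new double[n];
        int[] heapIdx = new int[n];
        int size = 0;
        for (int p = 0; p < products.length; p++) {
            double s = dot(user, products[p]);
            if (size < n) {
                // Heap not full yet: append and sift up.
                int i = size++;
                heapScore[i] = s;
                heapIdx[i] = p;
                while (i > 0) {
                    int parent = (i - 1) / 2;
                    if (heapScore[parent] <= heapScore[i]) break;
                    double ts = heapScore[i]; heapScore[i] = heapScore[parent]; heapScore[parent] = ts;
                    int ti = heapIdx[i]; heapIdx[i] = heapIdx[parent]; heapIdx[parent] = ti;
                    i = parent;
                }
            } else if (n > 0 && s > heapScore[0]) {
                // Better than the current worst kept score: replace the root.
                heapScore[0] = s;
                heapIdx[0] = p;
                siftDown(heapScore, heapIdx, 0, size);
            }
        }
        // Pop the minimum repeatedly, filling the result from the back.
        int[] result = new int[size];
        for (int end = size; end > 0; end--) {
            result[end - 1] = heapIdx[0];
            heapScore[0] = heapScore[end - 1];
            heapIdx[0] = heapIdx[end - 1];
            siftDown(heapScore, heapIdx, 0, end - 1);
        }
        return result;
    }
}
```

Calling `TopKSketch.topK(userFactors, productFactors, 10)` once per user keeps garbage proportional to the number of users rather than to the number of user-product pairs. Whether that matters in practice depends on the actual allocation profile, which would need to be measured.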

            People

              Assignee: Peng Meng (peng.meng@intel.com)
              Reporter: Joseph K. Bradley (josephkb)
              Votes: 1
              Watchers: 12