  1. Spark
  2. SPARK-11968

ALS recommend-all methods spend most of their time in GC

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.2, 1.6.0
    • Fix Version/s: 2.2.0
    • Component/s: ML, MLlib
    • Labels:
      None

      Description

      After adding recommendUsersForProducts and recommendProductsForUsers to ALS in spark-perf, I noticed that it takes much longer than ALS itself. Looking at the monitoring page, I can see it is spending about 8min doing GC for each 10min task. That sounds fixable. Looking at the implementation, there is clearly an opportunity to avoid extra allocations: https://github.com/apache/spark/blob/e6dd237463d2de8c506f0735dfdb3f43e8122513/mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala#L283

      CC: Xiangrui Meng


          Activity

          mlnick Nick Pentreath added a comment -

          Issue resolved by pull request 17742
          https://github.com/apache/spark/pull/17742

          mlnick Nick Pentreath added a comment -

          Thanks - in the meantime I will take a look at the PR.

          peng.meng@intel.com Peng Meng added a comment -

          Thanks Nick Pentreath, I will post more results here.
          My latest result is that I have changed the PriorityQueue to a BoundedPriorityQueue, which gives about a 30% improvement. I will update the PR and the results.
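
          For readers unfamiliar with the idea, a bounded priority queue keeps only the k best candidates seen so far, so memory per user stays at k entries no matter how many items are scored. The following is a minimal sketch of that technique in Python, not the code from the PR: `BoundedTopK` and `top_k_items` are hypothetical names introduced for illustration, and the item factors are assumed to be a plain dict of lists.

```python
import heapq


class BoundedTopK:
    """Keeps the k largest (score, item) pairs seen so far (hypothetical helper)."""

    def __init__(self, k):
        if k <= 0:
            raise ValueError("k must be positive")
        self.k = k
        self._heap = []  # min-heap: the smallest retained entry sits at index 0

    def offer(self, score, item):
        entry = (score, item)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
        elif entry > self._heap[0]:
            # Swap out the current minimum; the heap never grows past k entries.
            heapq.heapreplace(self._heap, entry)

    def result(self):
        # Highest score first.
        return sorted(self._heap, reverse=True)


def top_k_items(user_vec, item_factors, k):
    """Score every item against one user and return the k best (score, item_id) pairs."""
    queue = BoundedTopK(k)
    for item_id, item_vec in item_factors.items():
        score = sum(u * v for u, v in zip(user_vec, item_vec))
        queue.offer(score, item_id)
    return queue.result()
```

          An unbounded queue would hold one entry per scored item before the top k are extracted; the bounded variant discards losing candidates immediately, which is one plausible source of the reduced allocation reported above.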

          apachespark Apache Spark added a comment -

          User 'mpjlu' has created a pull request for this issue:
          https://github.com/apache/spark/pull/17742

          mlnick Nick Pentreath added a comment -

          Peng Meng would you mind posting your comments here about the solution from SPARK-20446 as well as the experiment timings? You can rename your PR to include this JIRA (SPARK-11968) in the title instead, in order to link it.

          Also please include the timings here of the ml DataFrame version for comparison. Your approach should also be much faster than the current ml SparkSQL approach, I think.

          I just did some quick tests using MovieLens latest data (~24 million ratings, ~260,000 users, ~39,000 items) and found the following (note these are very rough timings):

          Using default block sizes:

          Current ml master: 262 sec
          My approach: 58 sec
          Your PR: 35 sec

          You're correct that there is still roughly 20-25% GC time overhead using the BLAS 3 + sorting approach. Potentially it could be slightly improved through some form of pre-allocation, but even then it does look like any benefit of BLAS 3 is smaller than the GC cost.
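
          To illustrate the trade-off being discussed, here is a rough sketch contrasting the two allocation patterns. It is not either implementation from this thread: the function names are hypothetical, blocks are assumed to be lists of (id, vector) pairs, and pure Python cannot reproduce JVM GC behavior. It only shows that materializing a full score row per user (the shape a matrix multiply returns) keeps n entries alive before sorting, while a streaming selection retains at most k.

```python
import heapq


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_k_materialized(user_block, item_block, k):
    """Build every score for a user first, then sort the full row and truncate."""
    results = {}
    for user_id, user_vec in user_block:
        # All n (score, item_id) pairs are alive at once before the sort.
        row = [(dot(user_vec, item_vec), item_id) for item_id, item_vec in item_block]
        row.sort(reverse=True)
        results[user_id] = row[:k]
    return results


def top_k_streaming(user_block, item_block, k):
    """Feed scores one at a time into a selection that retains at most k of them."""
    results = {}
    for user_id, user_vec in user_block:
        scores = ((dot(user_vec, item_vec), item_id) for item_id, item_vec in item_block)
        results[user_id] = heapq.nlargest(k, scores)
    return results
```

          Both functions return the same recommendations; they differ only in how many intermediate entries exist at a time, which is the kind of cost the GC figures above are measuring.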

          mlnick Nick Pentreath added a comment -

          Note, there is a solution proposed in SPARK-20446. I've redirected the discussion to this original JIRA.

          mlnick Nick Pentreath added a comment -

          While working on performance testing for ALS parity I've got a possible solution for this.

          Will update once I have some more concrete numbers.

          josephkb Joseph K. Bradley added a comment -

          I disagree; this is a problem but is non-trivial to fix. However, I'm OK closing it since I hope we sidestep the issue by porting the implementation to DataFrames before too long.

          srowen Sean Owen added a comment -

          In Progress vs Open really doesn't matter. It's OK to change it back, but really this one looks like a Not A Problem.

          imatiach Ilya Matiach added a comment -

          Can someone with permissions change the status from In Progress to Open? The pull request that was sent was closed and the issue still exists.

          apachespark Apache Spark added a comment -

          User 'rekhajoshm' has created a pull request for this issue:
          https://github.com/apache/spark/pull/9980


            People

            • Assignee:
              peng.meng@intel.com Peng Meng
              Reporter:
              josephkb Joseph K. Bradley
            • Votes:
              1
              Watchers:
              13
