MAHOUT-1824

Optimize FlinkOpAtA to use upper triangular matrices

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: 0.11.2
    • Fix Version/s: 0.12.0
    • Component/s: Flink

      Description

      Optimize FlinkOpAtA to use upper triangular matrices (similar to what's being done in Spark backend).

      Currently, the dals test fails during the FlinkOpAtA computation with an OutOfMemoryError:

      57766 [flink-akka.actor.default-dispatcher-5] ERROR akka.actor.ActorSystemImpl  - exception on LARS’ timer thread
      java.lang.OutOfMemoryError: GC overhead limit exceeded
      57770 [flink-akka.actor.default-dispatcher-5] ERROR akka.actor.ActorSystemImpl  - Uncaught fatal error from thread [flink-scheduler-1] shutting down ActorSystem [flink]
      java.lang.OutOfMemoryError: GC overhead limit exceeded
      - dals *** FAILED ***
        org.apache.flink.runtime.client.JobTimeoutException: Timeout while waiting for JobManager answer. Job time exceeded 21474835 seconds
        at org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:136)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:423)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:409)
        at org.apache.flink.runtime.minicluster.FlinkMiniCluster.submitJobAndWait(FlinkMiniCluster.scala:401)
        at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:190)
        at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:90)
        at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:855)
        at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:638)
        at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:546)
        at org.apache.mahout.flinkbindings.blas.FlinkOpAtA$.slim(FlinkOpAtA.scala:53)
        ...
        Cause: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/$a#372851579]] after [21474835000 ms]
        at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:333)
        at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)
        at akka.actor.LightArrayRevolverScheduler$TaskHolder.run(Scheduler.scala:476)
        at akka.actor.LightArrayRevolverScheduler$$anonfun$close$1.apply(Scheduler.scala:282)
        at akka.actor.LightArrayRevolverScheduler$$anonfun$close$1.apply(Scheduler.scala:281)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at akka.actor.LightArrayRevolverScheduler.close(Scheduler.scala:280)
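
      Because A'A is symmetric, only its upper triangle needs to be accumulated and collected: n(n+1)/2 values instead of n² for an n-column A. The following is a minimal sketch of that idea in plain Scala. It is not the code in FlinkOpAtA.scala and does not use Mahout's matrix types; the object `UpperTriangularAtA` and its methods `idx`, `partial`, `merge` and `toDense` are hypothetical names chosen for illustration. It assumes each partition of a row-wise dataset produces one packed partial result, and that the partials are summed before being expanded to a full matrix.

      ```scala
      // Hypothetical helper, not Mahout's API: packed upper-triangular accumulation of A'A.
      object UpperTriangularAtA {

        // Position of entry (i, j), with i <= j, in a row-major packed upper triangle of order n.
        def idx(n: Int, i: Int, j: Int): Int = i * n - i * (i - 1) / 2 + (j - i)

        // Sum the upper triangles of the outer products row' * row over one block of rows of A.
        def partial(rows: Iterator[Array[Double]], n: Int): Array[Double] = {
          val ut = new Array[Double](n * (n + 1) / 2)
          for (row <- rows; i <- 0 until n if row(i) != 0.0; j <- i until n)
            ut(idx(n, i, j)) += row(i) * row(j)
          ut
        }

        // Combine two partial results (the reduce step across partitions).
        def merge(a: Array[Double], b: Array[Double]): Array[Double] = {
          val out = new Array[Double](a.length)
          var k = 0
          while (k < a.length) { out(k) = a(k) + b(k); k += 1 }
          out
        }

        // Expand the packed triangle into a full symmetric n x n matrix.
        def toDense(ut: Array[Double], n: Int): Array[Array[Double]] =
          Array.tabulate(n, n)((i, j) => if (i <= j) ut(idx(n, i, j)) else ut(idx(n, j, i)))
      }
      ```

      In this sketch, `partial` would run once per partition, `merge` would serve as the reduce function, and `toDense` would be applied once to the collected result, so the data held and shipped per partition is roughly half of a full n x n block.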
      
      

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user andrewpalumbo opened a pull request:

        https://github.com/apache/mahout/pull/211

        MAHOUT-1824 optimize FlinkOpAtA.slim() to use upper triangle matrix

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/andrewpalumbo/mahout MAHOUT-1824

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/mahout/pull/211.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #211


        commit 17f57bb0ff393e6d7bec8263e22a3e6301989a23
        Author: Andrew Palumbo <apalumbo@apache.org>
        Date: 2016-04-08T05:57:24Z

        wip: begin upper triangle matrix usage from a rowwise dataset

        commit 32425078a44974e2fa556fb47d60499e91840163
        Author: Andrew Palumbo <apalumbo@apache.org>
        Date: 2016-04-08T06:09:16Z

        Finished OpAtA.slim optimization: getting

        java.lang.IllegalStateException: unread block data
        at java.io.ObjectInputStream.setBlockDataMode(ObjectInputStream.java:2431)

        error now

        commit bc0112937987f7a311473d3067eece62202feb61
        Author: Andrew Palumbo <apalumbo@apache.org>
        Date: 2016-04-08T06:17:11Z

        cleanup

        commit c4b3e91fa4aac2ed6fbb61950ef9f3cc0bf8368c
        Author: Andrew Palumbo <apalumbo@apache.org>
        Date: 2016-04-08T07:24:05Z

        tweak akka configs


        githubbot ASF GitHub Bot added a comment -

        Github user andrewpalumbo commented on the pull request:

        https://github.com/apache/mahout/pull/211#issuecomment-207294922

        Closed, minus the "tweak akka configs" commit, by commit 4fc65d4e26957cfef68eb30e0bf712758e21a5a1 on the flink branch.

        githubbot ASF GitHub Bot added a comment -

        Github user andrewpalumbo closed the pull request at:

        https://github.com/apache/mahout/pull/211

        hudson Hudson added a comment -

        FAILURE: Integrated in Mahout-Quality #3324 (See https://builds.apache.org/job/Mahout-Quality/3324/)
        MAHOUT-1824: Optimize FlinkOpAtA to use upper triangular matrices. (apalumbo: rev 4fc65d4e26957cfef68eb30e0bf712758e21a5a1)

        • flink/src/test/scala/org/apache/mahout/flinkbindings/FailingTestsSuite.scala
        • flink/src/main/scala/org/apache/mahout/flinkbindings/blas/FlinkOpAtA.scala
        smarthi Suneel Marthi added a comment -

        Closing issues following Mahout 0.12.0 release on April 11, 2016


          People

          • Assignee: Andrew Palumbo (Andrew_Palumbo)
          • Reporter: Suneel Marthi (smarthi)
          • Votes: 0
          • Watchers: 3
