MAHOUT-1815

dsqDist(X,Y) and dsqDist(X) failing in flink tests.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

        test("dsqDist(X,Y)") {
          val m = 100
          val n = 300
          val d = 7
          val mxX = Matrices.symmetricUniformView(m, d, 12345).cloned -= 5
          val mxY = Matrices.symmetricUniformView(n, d, 1234).cloned += 10
          val (drmX, drmY) = (drmParallelize(mxX, 3), drmParallelize(mxY, 4))
      
          val mxDsq = dsqDist(drmX, drmY).collect
          val mxDsqControl = new DenseMatrix(m, n) := { (r, c, _) ⇒ (mxX(r, ::) - mxY(c, ::)) ^= 2 sum }
          (mxDsq - mxDsqControl).norm should be < 1e-7
        }
      

      And

       test("dsqDist(X)") {
          val m = 100
          val d = 7
          val mxX = Matrices.symmetricUniformView(m, d, 12345).cloned -= 5
          val drmX = drmParallelize(mxX, 3)
      
          val mxDsq = dsqDist(drmX).collect
          val mxDsqControl = sqDist(drmX)
          (mxDsq - mxDsqControl).norm should be < 1e-7
        }
      

      Both tests are failing in the Flink tests with an ArrayIndexOutOfBoundsException:

      03/15/2016 17:02:19	DataSink (org.apache.flink.api.java.Utils$CollectHelper@568b43ab)(5/10) switched to FINISHED 
      1 [CHAIN GroupReduce (GroupReduce at org.apache.mahout.flinkbindings.blas.FlinkOpAtB$.notZippable(FlinkOpAtB.scala:78)) -> Map (Map at org.apache.mahout.flinkbindings.blas.FlinkOpMapBlock$.apply(FlinkOpMapBlock.scala:37)) -> FlatMap (FlatMap at org.apache.mahout.flinkbindings.drm.BlockifiedFlinkDrm.asRowWise(FlinkDrm.scala:93)) (8/10)] ERROR org.apache.flink.runtime.operators.BatchTask  - Error in task code:  CHAIN GroupReduce (GroupReduce at org.apache.mahout.flinkbindings.blas.FlinkOpAtB$.notZippable(FlinkOpAtB.scala:78)) -> Map (Map at org.apache.mahout.flinkbindings.blas.FlinkOpMapBlock$.apply(FlinkOpMapBlock.scala:37)) -> FlatMap (FlatMap at org.apache.mahout.flinkbindings.drm.BlockifiedFlinkDrm.asRowWise(FlinkDrm.scala:93)) (8/10)
      java.lang.ArrayIndexOutOfBoundsException: 5
      	at org.apache.mahout.math.drm.package$$anonfun$4$$anonfun$apply$3.apply(package.scala:317)
      	at org.apache.mahout.math.drm.package$$anonfun$4$$anonfun$apply$3.apply(package.scala:317)
      	at org.apache.mahout.math.scalabindings.MatrixOps$$anonfun$$colon$eq$3$$anonfun$apply$2.apply(MatrixOps.scala:164)
      	at org.apache.mahout.math.scalabindings.MatrixOps$$anonfun$$colon$eq$3$$anonfun$apply$2.apply(MatrixOps.scala:164)
      	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
      	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
      	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
      	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
      	at org.apache.mahout.math.scalabindings.MatrixOps$$anonfun$$colon$eq$3.apply(MatrixOps.scala:164)
      	at org.apache.mahout.math.scalabindings.MatrixOps$$anonfun$$colon$eq$3.apply(MatrixOps.scala:164)
      	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
      	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
      	at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
      	at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
      	at org.apache.mahout.math.scalabindings.MatrixOps.$colon$eq(MatrixOps.scala:164)
      	at org.apache.mahout.math.drm.package$$anonfun$4.apply(package.scala:317)
      	at org.apache.mahout.math.drm.package$$anonfun$4.apply(package.scala:311)
      	at org.apache.mahout.flinkbindings.blas.FlinkOpMapBlock$$anonfun$1.apply(FlinkOpMapBlock.scala:39)
      	at org.apache.mahout.flinkbindings.blas.FlinkOpMapBlock$$anonfun$1.apply(FlinkOpMapBlock.scala:38)
      	at org.apache.flink.api.scala.DataSet$$anon$1.map(DataSet.scala:297)
      	at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:78)
      	at org.apache.mahout.flinkbindings.blas.FlinkOpAtB$$anon$6.reduce(FlinkOpAtB.scala:86)
      	at org.apache.flink.runtime.operators.GroupReduceDriver.run(GroupReduceDriver.java:125)
      	at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:480)
      	at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
      	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
      	at java.lang.Thread.run(Thread.java:745)
      

          Activity

          Andrew_Palumbo Andrew Palumbo added a comment - - edited

          The exception is thrown here, in dsqDist(X):

                block := { (r, c, x) ⇒ s(keys(r)) + s(c) - 2 * x}
          

          and similarly in dsqDist(X,Y).

          The offending call is keys(r), where r >= keys.size.

          This can be seen in the following trace from within dsqDist(X):

           Keys.size: 5  block.rrow: 10
          

          As these tests pass in H2O and Spark, this is likely due to a partitioning problem in the Flink bindings,

          i.e. (key, block) tuples are somehow being shuffled or mangled.
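
To make the failure mode concrete, here is a minimal sketch in plain Scala that mimics the closure above on ordinary arrays instead of Mahout types. The names `KeyBlockMismatch` and `fill` are hypothetical, and `s` is assumed to hold one squared row norm per row of X, so that each entry follows the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2a·b. With 5 keys and a 10-row block, as in the trace, the lookup keys(r) fails as soon as r reaches 5.

```scala
// Hypothetical stand-ins for the values seen inside the mapBlock closure.
// Plain Scala arrays are used here; no Mahout or Flink types are involved.
object KeyBlockMismatch {

  // Mimics `block := { (r, c, x) => s(keys(r)) + s(c) - 2 * x }` in place.
  def fill(keys: Array[Int], s: Array[Double], block: Array[Array[Double]]): Unit =
    for (r <- block.indices; c <- block(r).indices)
      block(r)(c) = s(keys(r)) + s(c) - 2 * block(r)(c)

  def main(args: Array[String]): Unit = {
    val s = Array.fill(10)(1.0)         // assumed: squared row norms of X
    val keys = Array(0, 1, 2, 3, 4)     // 5 keys, as in the trace
    val block = Array.fill(10, 10)(0.0) // 10-row block, as in the trace

    try fill(keys, s, block)
    catch {
      // keys(5) does not exist, so the loop aborts at row 5
      case e: ArrayIndexOutOfBoundsException =>
        println(s"out of bounds: ${e.getMessage}")
    }
  }
}
```

The sketch only shows why a key array shorter than the block's row count breaks the closure; it says nothing about how the Flink bindings arrive at such a tuple.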

          Andrew_Palumbo Andrew Palumbo added a comment -

          The error is likely due to the implementation of FlinkOpAtB, which sets the matrix block height to an arbitrary value of 10:

              val blockHeight = 10
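
As a rough illustration of what a fixed block height implies, the sketch below splits a row range into consecutive blocks of at most `blockHeight` rows. `FixedHeightBlocks` and `blockRanges` are hypothetical names, and the example row count of 25 is arbitrary; this is not the code in FlinkOpAtB. The point is that each block's key array needs exactly as many entries as the block has rows, and the last block is shorter whenever the row count is not a multiple of the height.

```scala
// Hypothetical helper, not taken from FlinkOpAtB: shows how a fixed
// block height partitions row indices into consecutive ranges.
object FixedHeightBlocks {

  // Splits 0 until nrow into ranges of at most blockHeight rows each.
  def blockRanges(nrow: Int, blockHeight: Int = 10): Seq[Range] =
    (0 until nrow by blockHeight).map { start =>
      start until math.min(start + blockHeight, nrow)
    }

  def main(args: Array[String]): Unit = {
    // Each range doubles as the key array a block of that height would need.
    blockRanges(25).foreach { r =>
      println(s"rows ${r.start} until ${r.end}: ${r.size} keys needed")
    }
  }
}
```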
          
          githubbot ASF GitHub Bot added a comment -

          GitHub user andrewpalumbo opened a pull request:

          https://github.com/apache/mahout/pull/197

          MAHOUT-1815: dsqDist(X,Y) and dsqDist(X) failing in flink tests.

          After taking the very long way around (trying to repartition, etc.), it turns out that the rows just needed to be properly re-keyed.

          Tests pass now.

          Though we may want to re-examine the implementation of FlinkOpAtB, as it seems pretty inefficient.
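
One way to picture re-keying is sketched below, under the assumption that each block knows the global index of its first row. `RekeyRows`, `rekey`, and `blockStart` are hypothetical names used only for illustration; the actual change is in the pull request and may differ. Every row is emitted with its own global key, so the number of keys always equals the number of rows.

```scala
// Hypothetical illustration of re-keying; not the code from the pull request.
object RekeyRows {

  // Pairs each row of a block with its global row index,
  // computed as the block's starting row plus the local offset.
  def rekey(blockStart: Int, rows: Array[Array[Double]]): Seq[(Int, Array[Double])] =
    rows.zipWithIndex.map { case (row, i) => (blockStart + i, row) }.toSeq

  def main(args: Array[String]): Unit = {
    val rows = Array.fill(5, 3)(0.0) // a 5-row block that starts at global row 20
    println(rekey(20, rows).map(_._1).mkString(", "))
  }
}
```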

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/andrewpalumbo/mahout MAHOUT-1815

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/mahout/pull/197.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #197


          commit 731b87c88f65d7ecc0d098405f27cded5f094fa2
          Author: Andrew Palumbo <apalumbo@apache.org>
          Date: 2016-03-17T22:11:26Z

          properly re-key rows in FlinkOpAtB


          githubbot ASF GitHub Bot added a comment -

          Github user andrewpalumbo commented on the pull request:

          https://github.com/apache/mahout/pull/197#issuecomment-198109270

          This is a small fix so if there are no objections I'm just gonna push it.

          githubbot ASF GitHub Bot added a comment -

          Github user smarthi commented on the pull request:

          https://github.com/apache/mahout/pull/197#issuecomment-198109713

          LGTM

          githubbot ASF GitHub Bot added a comment -

          Github user andrewpalumbo commented on the pull request:

          https://github.com/apache/mahout/pull/197#issuecomment-198109765

          thx

          githubbot ASF GitHub Bot added a comment -

          Github user andrewpalumbo commented on the pull request:

          https://github.com/apache/mahout/pull/197#issuecomment-198110346

          merged to flink-binding

          githubbot ASF GitHub Bot added a comment -

          Github user andrewpalumbo closed the pull request at:

          https://github.com/apache/mahout/pull/197

          Andrew_Palumbo Andrew Palumbo added a comment - - edited

          merged to apache/flink-binding

          dlyubimov Dmitriy Lyubimov added a comment -

          bulk-closing resolved issues

          hudson Hudson added a comment -

          FAILURE: Integrated in Mahout-Quality #3324 (See https://builds.apache.org/job/Mahout-Quality/3324/)
          MAHOUT-1815: dsqDist(X,Y) and dsqDist(X) failing in flink tests. closes (apalumbo: rev 2e8790d5c6e0f337abe55906e052d7236f046207)

          • flink/src/main/scala/org/apache/mahout/flinkbindings/FlinkEngine.scala
          • flink/src/main/scala/org/apache/mahout/flinkbindings/blas/FlinkOpAtB.scala

            People

            • Assignee: Andrew Palumbo
            • Reporter: Andrew Palumbo
            • Votes: 0
            • Watchers: 4
