Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9
    • Fix Version/s: 0.10.0
    • Component/s: None
    • Labels:
      None

      Description

      Spark bindings for Mahout DRM.
      DRM DSL.

      Disclaimer. This will all be experimental at this point.

      The idea is to back the DRM with a Spark RDD, supporting some basic functionality and perhaps the humble beginnings of a cost-based optimizer:

      (0) Spark serialization support for Vector, Matrix
      (1) Bagel transposition
      (2) slim X'X (see the sketch after this list)
      (2a) not-so-slim X'X
      (3) blockify() (compose an RDD containing vertical blocks of the original input)
      (4) read/write Mahout DRM off HDFS
      (5) A'B

      ...
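      For item (2), a minimal sketch of what slim X'X could look like on top of a Spark RDD of row-keyed Mahout vectors. Everything here is illustrative: the RDD element type, the helper name slimXtX, and the use of mapPartitions/reduce are assumptions for the sketch, not the committed code, and it assumes the column count n is small enough for an n x n dense matrix to fit in memory (the "slim" case).

      import scala.collection.JavaConversions._
      import org.apache.spark.rdd.RDD
      import org.apache.mahout.math.{DenseMatrix, Matrix, Vector}

      // Each partition accumulates the outer products of its rows into a small n x n
      // in-core matrix; the per-partition partial sums are then added up on the driver.
      def slimXtX(rows: RDD[(Int, Vector)], n: Int): Matrix =
        rows.mapPartitions { it =>
          val acc: Matrix = new DenseMatrix(n, n)
          for ((_, row) <- it; ei <- row.nonZeroes()) {
            val i = ei.index()
            val vi = ei.get()
            for (ej <- row.nonZeroes())
              acc.setQuick(i, ej.index(), acc.getQuick(i, ej.index()) + vi * ej.get())
          }
          Iterator(acc)
        }.reduce(_.plus(_))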

      Attachments

      1. ScalaSparkBindings.pdf
        569 kB
        Dmitriy Lyubimov
      2. BindingsStack.jpg
        72 kB
        Dmitriy Lyubimov

          Activity

          ssc Sebastian Schelter added a comment -

          Very excited to see this. I think this is a great direction to look into.

          dlyubimov Dmitriy Lyubimov added a comment -

          One inconvenience I keep running into while working on this is that there's no efficient matrix block implementation that just takes hanging vectors and puts them into a hash map while retaining the original matrix geometry (e.g. if I cut out a block (3001:3010, 3001:3010), it shouldn't allocate hanging-vector tables 1:m to represent it).

          SparseRow/ColumnMatrix does almost what I need, except it needs the "Vector[] rows" replaced with some sort of HashMap.

          Maybe I need a new type, something like a SparseBlockRowMatrix, that does that. Unfortunately I feel it adds a bit of overload to the in-core world of types; I would rather modify SparseRow/ColumnMatrix into this. It seems like a simple change, but of course it has implications for usage patterns, since it would no longer provide good sequential row-wise iteration speed.

          Without this, I am forced to manipulate hash maps of index -> vector, which is not a nice abstraction and rules out the in-core DSL.
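          For illustration, a rough Scala sketch of the kind of block type meant here. The name SparseBlockRowMatrix and the whole class are hypothetical (not an existing Mahout type): rows hang off a small hash map keyed by row index, while the block keeps the original geometry.

          import scala.collection.mutable
          import org.apache.mahout.math.{RandomAccessSparseVector, Vector}

          // Hypothetical sketch: a row-wise sparse block that keeps the original matrix
          // geometry (nrow x ncol) but only materializes the rows that actually exist.
          class SparseBlockRowMatrix(val nrow: Int, val ncol: Int) {
            private val rows = mutable.HashMap.empty[Int, Vector]

            def getQuick(r: Int, c: Int): Double =
              rows.get(r).map(_.getQuick(c)).getOrElse(0.0)

            def setQuick(r: Int, c: Int, v: Double): Unit =
              rows.getOrElseUpdate(r, new RandomAccessSparseVector(ncol)).setQuick(c, v)

            // The price paid: no fast sequential row-wise iteration; only assigned rows are visited.
            def nonEmptyRows: Iterator[(Int, Vector)] = rows.iterator
          }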

          dlyubimov Dmitriy Lyubimov added a comment - edited

          https://github.com/dlyubimov/mahout-commits/tree/dev-0.9.x-scala

          I started moving some things there. In particular, ALS is still not there (I still haven't hashed it out with my boss), but there are some initial matrix algorithms to be picked up (even transposition can be blockified and improved).

          Anyone want to give me a hand on this?

          Please don't pick up weighted ALS-WR for now; I still hope to finish porting it.

          There are more interesting questions there, like parameter validation and fitting.
          A common problem I have: suppose you take the implicit feedback approach. Then you reformulate it in terms of preference (P) and confidence (C) inputs. The original paper describes a specific scheme for forming C that includes one parameter they want to fit.

          The more interesting question is: what if we have more than one parameter? I.e. what if we have a bunch of user behaviors, say an item search, browse, click, add-to-cart, and finally, acquisition? That's a whole bunch of parameters for forming the confidence of a user's preference. It is reasonable to assume that, since every transaction is preceded by add-to-cart, add-to-cart signifies a positive preference in general (we are just far less confident about it). Then again, an abandoned cart may also signify a negative preference, or nothing at all.

          Anyway, suppose we want to explore what's worth what. The natural way to do it is, again, through cross-validation. Posing such a problem presents a whole new look at "Big Data ML" problems: now we are using distributed processing not just because the input might be big, but also because we have a lot of parameter-space exploration to do (even if the single-iteration problem is not that big), and we can thus produce more interesting analytical results.

          However, since there are many parameters, the task becomes considerably more interesting. Since there is not much test data (we should still assume just a handful of cross-validation runs), various "online" convex search techniques like SGD or BFGS are not going to be very viable. What I was thinking is that maybe we can run parallel tries and fit the data to paraboloids (i.e. second-degree polynomial regression without interaction terms). That may be a big assumption, but it would be enough to get a general sense of where the global maximum may be, even on inputs of a fairly small size. Of course we may discover hyperbolic-paraboloid behavior along some parameter axes, in which case it would mean we got the preference wrong, so we flip the preference mapping (i.e. click = (P=1, C=0.5) would flip into click = (P=0, C=0...)) and re-validate. This is a kind of multidimensional variation of the one-parameter second-degree polynomial fitting that Raphael referred to once (see the per-axis sketch below).

          We are taking on a lot of assumptions here (parameter independence, existence of a good global maximum, etc.). Perhaps there's something better for automating that search?
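          To make the per-axis idea concrete, a tiny self-contained sketch (plain Scala; everything here is illustrative, not part of the codebase) of the one-parameter second-degree fit: given cross-validation scores observed at a few settings of a single parameter, fit y = a + b*x + c*x^2 by least squares and take the vertex -b/(2c) as the candidate optimum for that axis.

          // Fit y = a + b*x + c*x^2 to (x, y) observations by ordinary least squares and
          // return (a, b, c). Illustrative only; assumes at least 3 distinct x values.
          def fitQuadratic(xs: Array[Double], ys: Array[Double]): (Double, Double, Double) = {
            require(xs.length == ys.length && xs.length >= 3)
            val s = Array.ofDim[Double](5)   // s(k) = sum of x^k, k = 0..4
            val t = Array.ofDim[Double](3)   // t(k) = sum of y * x^k, k = 0..2
            for ((x, y) <- xs.zip(ys); k <- 0 to 4) {
              s(k) += math.pow(x, k)
              if (k <= 2) t(k) += y * math.pow(x, k)
            }
            // Normal equations: [[s0 s1 s2], [s1 s2 s3], [s2 s3 s4]] * (a, b, c)' = (t0, t1, t2)'
            def det3(m: Array[Array[Double]]): Double =
              m(0)(0) * (m(1)(1) * m(2)(2) - m(1)(2) * m(2)(1)) -
                m(0)(1) * (m(1)(0) * m(2)(2) - m(1)(2) * m(2)(0)) +
                m(0)(2) * (m(1)(0) * m(2)(1) - m(1)(1) * m(2)(0))
            val m = Array(Array(s(0), s(1), s(2)), Array(s(1), s(2), s(3)), Array(s(2), s(3), s(4)))
            val d = det3(m)
            def solveFor(col: Int): Double =
              det3(m.zipWithIndex.map { case (row, i) => row.updated(col, t(i)) }) / d
            (solveFor(0), solveFor(1), solveFor(2))
          }

          // Vertex of the fitted parabola -- a maximum only when c < 0; c >= 0 along an axis
          // suggests the preference mapping for that behavior is probably flipped.
          def vertex(b: Double, c: Double): Double = -b / (2 * c)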

          Thanks.
          -Dmitriy

          dlyubimov Dmitriy Lyubimov added a comment -

          P.S. I am kind of dubious that a step-recorded search would be efficient enough either. First, we should not assume we are dealing with a nice convex landscape. Second, I assume a step-recorded search may take fairly long.

          dlyubimov Dmitriy Lyubimov added a comment -

          P.P.S. The spark module has a specific CDH-2.0 profile to build against CDH 2.0 releases (it could be plain Hadoop, but that's what I happen to be using at the moment), which is also what Spark 0.8 is commonly built against. You're welcome to add more 2.0 profiles.

          ssc Sebastian Schelter added a comment -

          I can help by looking at the sources from time to time. I'm also working on using Mahout vectors in Spark at the moment, though in a slightly different context.

          dlyubimov Dmitriy Lyubimov added a comment -

          Can that context be part of Mahout, or would that be way off?

          ssc Sebastian Schelter added a comment -

          It's way off unfortunately; it's exploratory research.

          dlyubimov Dmitriy Lyubimov added a comment -

          Aha. Spark 0.9.0 with GraphX is finally released. Time to get my hands dirty with this a bit, methinks.

          ssc Sebastian Schelter added a comment -

          Big +1 on that.

          dlyubimov Dmitriy Lyubimov added a comment -

          This is now tracked here: https://github.com/dlyubimov/mahout-commits/tree/dev-1.0-spark
          (new module "spark").

          I have been rewriting certain things from scratch.

          Concepts:
          (a) Logical operators (including DRM sources) are expressed as the DrmLike trait.
          (b) Taking a note from the Spark playbook, DRM operators (such as %*% or t) form an operator lineage. The operator lineage does not get optimized into RDDs until an "action" is applied (Spark terminology).

          (c) Unlike in Spark, an "action" doesn't really cause any execution; rather, it (1) forms the optimized RDD sequence and (2) produces a "checkpointed" DRM. Consequently, a "checkpointed" DRM has an RDD lineage attached to it, which is also marked for caching. Additional lineages starting out of a checkpointed DRM will not be able to optimize beyond that checkpoint.

          (d) There's a "super action" on a checkpointed DRM - such as collection or persistence to HDFS - that triggers, if necessary, the optimization checkpoint and the Spark action.

          E.g.

          val A = drmParallelize(...)

          // doesn't do anything yet; gives the operator lineage an opportunity to grow further before being optimized
          val squaredA = A.t %*% A

          // we may trigger the optimizer, RDD lineage generation and caching explicitly by:
          squaredA.checkpoint()

          // or we can call a "super action" directly; this triggers checkpoint() implicitly if not yet done
          val inCoreSquaredA = squaredA.collect()
          

          Generally, I support very few things so far – I actually dropped all previously implemented Bagel algorithms, so in fact I have less support now than in the 0.9 branch.

          I have Kryo support for Mahout vectors and matrix blocks.
          I have HDFS read/write of Mahout's DRM into the DrmLike trait.
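          For reference, a minimal sketch of what that Kryo wiring typically looks like in Spark 0.8/0.9. The class below is an illustrative stand-in (the committed registrator may register different classes and custom serializers), and the property names in the trailing comment are the usual Spark ones, shown here only as an assumption of how it gets enabled.

          import com.esotericsoftware.kryo.Kryo
          import org.apache.spark.serializer.KryoRegistrator
          import org.apache.mahout.math.{DenseMatrix, DenseVector, RandomAccessSparseVector, SequentialAccessSparseVector, SparseRowMatrix}

          // Illustrative registrator: tells Kryo about the Mahout vector/matrix classes
          // that travel through shuffles and broadcasts.
          class MahoutKryoRegistratorSketch extends KryoRegistrator {
            override def registerClasses(kryo: Kryo): Unit = {
              kryo.register(classOf[DenseVector])
              kryo.register(classOf[RandomAccessSparseVector])
              kryo.register(classOf[SequentialAccessSparseVector])
              kryo.register(classOf[DenseMatrix])
              kryo.register(classOf[SparseRowMatrix])
            }
          }

          // Enabling it (Spark 0.9-era properties):
          //   spark.serializer       = org.apache.spark.serializer.KryoSerializer
          //   spark.kryo.registrator = <your registrator class>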

          I have some DSL defined, such as:
          A %*% B
          A %*% inCoreB
          inCoreA %*%: B

          A.t
          inCoreA = A.collect

          A.blockify (coalesces split records into an RDD of vertical blocks – a paradigm similar to MLI's MatrixSubmatrix, except that I implemented it before MLI was first announced, so in fact there is no MLI influence here)
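          A tiny snippet tying these together (illustrative; it assumes the dense() helper from the math-scala scalabindings package and the drmParallelize/collect calls shown above, with the exact imports as laid out in the branch):

          import org.apache.mahout.math.scalabindings._
          // plus the sparkbindings drm package for drmParallelize and the R-like DRM operators

          val inCoreA = dense((1, 2, 3), (3, 4, 5))
          val inCoreB = dense((1, 0), (0, 1), (1, 1))

          val A = drmParallelize(inCoreA)
          val B = drmParallelize(inCoreB)

          val C = A.t %*% A                 // logical plan only; nothing runs yet
          val inCoreC = C.collect           // optimize, execute, bring the result in-core

          val D = A %*% inCoreB             // in-core right operand
          val E = inCoreA %*%: B            // in-core left operand (right-associative form)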

          So now I need to reimplement what Bagel used to do, plus optimizer rules for choosing a distributed algorithm based on cost.

          In fact I came to the conclusion that there was zero benefit in using Bagel in the first place: it just maps all its primitives onto shuffle-and-hash group-by RDD operations, so there is no actual operational benefit to using it.

          I will probably reconstitute the algorithms in a first iteration using regular Spark primitives (groupBy and cartesian for multiplication blocks). A rough sketch of that style follows below.
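          For instance, an illustrative sketch (not the committed operator) of A'B over row-keyed RDDs using plain Spark primitives: join the co-keyed rows, emit a per-row outer product, and sum the partials. A real operator would blockify and combine per partition rather than build one small matrix per row; the function name and signature here are assumptions for the sketch.

          import scala.collection.JavaConversions._
          import org.apache.spark.SparkContext._
          import org.apache.spark.rdd.RDD
          import org.apache.mahout.math.{DenseMatrix, Matrix, Vector}

          // A'B = sum over shared row keys i of a_i * b_i' (outer product of the i-th rows).
          def atB(a: RDD[(Int, Vector)], b: RDD[(Int, Vector)], nColA: Int, nColB: Int): Matrix =
            a.join(b).map { case (_, (ai, bi)) =>
              val outer: Matrix = new DenseMatrix(nColA, nColB)
              for (ea <- ai.nonZeroes(); eb <- bi.nonZeroes())
                outer.setQuick(ea.index(), eb.index(), ea.get() * eb.get())
              outer
            }.reduce(_.plus(_))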

          Once I plug in the missing pieces (e.g. slim matrix multiplication), I bet I can fit the distributed SSVD version in 40 lines, just like the in-core one.

          Weighted ALS will still look less elegant because of some missing linear algebra features – for example, sparse block support (i.e. a bunch of sparse row or column vectors hanging off a very small hash map instead of a full-size array, as in SparseRow(Column)Matrix today) – but it will still be mostly R-like scripting as far as working with matrix blocks and decompositions goes.

          So at this point I'd like to hear input on these ideas and direction, and perhaps some suggestions. Thanks.

          dlyubimov Dmitriy Lyubimov added a comment -

          A few obvious optimizer rules:

          A.t %*% A is detected as a family of unary algorithms rather than a binary multiplication algorithm.

          Geometry and non-zero element estimates play a role in selecting the type of algorithm.

          The biggest multiplication via group-by will obviously have to deal with the cartesian operator and will apply to (A * B').

          Obvious rewrites (see the sketch below):
          A' * B' = (B * A)' (transposition push-up, including elementwise operators too)
          (A')' = A (transposition merge)
          cost-based grouping: (A*B)*C versus A*(B*C)
          special distributed algorithm versions for in-core operands and diagonal matrices
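          To illustrate, a toy sketch of what such rewrites could look like over a simplified logical-operator tree. The case classes here are simplified stand-ins defined for the sketch, not the actual plan classes in the branch, and the pass only shows the two transposition rewrites and the A'A detection.

          // Simplified stand-ins for logical operators.
          sealed trait LogicalOp
          case class Leaf(name: String)               extends LogicalOp
          case class OpAt(a: LogicalOp)               extends LogicalOp  // A'
          case class OpAB(a: LogicalOp, b: LogicalOp) extends LogicalOp  // A %*% B
          case class OpAtA(a: LogicalOp)              extends LogicalOp  // A' %*% A, unary family

          def rewrite(op: LogicalOp): LogicalOp = op match {
            case OpAt(OpAt(a))              => rewrite(a)                // (A')' = A
            case OpAB(OpAt(a), b) if a == b => OpAtA(rewrite(a))         // A' %*% A -> unary A'A
            case OpAB(OpAt(a), OpAt(b))     => OpAt(rewrite(OpAB(b, a))) // A' %*% B' = (B %*% A)'
            case OpAB(a, b)                 => OpAB(rewrite(a), rewrite(b))
            case OpAt(a)                    => OpAt(rewrite(a))
            case other                      => other
          }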

          dlyubimov Dmitriy Lyubimov added a comment -

          WIP manual and working notes

          dlyubimov Dmitriy Lyubimov added a comment -

          update

          ssc Sebastian Schelter added a comment -

          Really looking forward to this. Once we have a "pipe" to Spark, I'll probably donate some network analysis code.

          dlyubimov Dmitriy Lyubimov added a comment -

          @Sebastian (et al.), could you please review, if not the code, then at least the API pdf (attached)? At this point I have all the functional components to do distributed SSVD in the DSL, so it is really on the verge of commit, but I wouldn't want to do that without any review at all (given how relatively big and conceptual this thing is).

          ssc Sebastian Schelter added a comment -

          Dmitriy, I read through your write-up and I have no words. I really love what you created and I'm looking forward to having this as part of Mahout. I'd like to see our algorithms rewritten in your linear algebra DSL and executed (and optimized!) on Spark.

          Btw, be sure to also post your write-up to the Spark mailing list; I'm sure the guys there are also interested!

          dlyubimov Dmitriy Lyubimov added a comment -

          OK, this is finally done. SSVD is working, notes updated. I will commit it later tonight after additional review of misc stuff.

          Please look at the final API pdf, and the source if needed.

          This will also contain a fix for the CholeskyDecomposition bug that always reports a degenerate matrix.

          dlyubimov Dmitriy Lyubimov added a comment - edited

          Most of this code is not distributed-tested. The unit tests do due diligence and ensure matrices are produced with more than a trivial single partition, and I also verified some things on a live single-node Spark, but I haven't tried any significant datasets on a real-life cluster.

          The assumption is that we will have to continue working things out and gauging bottlenecks of the concrete implementations. It is possible that additional tuning parameters will be required, especially for the parts that do blocking, etc.

          So it should be marked as "evolving".

          hudson Hudson added a comment -

          SUCCESS: Integrated in Mahout-Quality #2506 (See https://builds.apache.org/job/Mahout-Quality/2506/)
          MAHOUT-1346

          Squashed commit of the following:

          (...too long to list... ) (dlyubimov: rev 1575169)

          • /mahout/trunk/bin/mahout
          • /mahout/trunk/math-scala/pom.xml
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/MatrixOps.scala
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/RLikeMatrixOps.scala
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/SSVD.scala
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/package.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/MathSuite.scala
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/CholeskyDecomposition.java
          • /mahout/trunk/pom.xml
          • /mahout/trunk/spark
          • /mahout/trunk/spark/pom.xml
          • /mahout/trunk/spark/src
          • /mahout/trunk/spark/src/main
          • /mahout/trunk/spark/src/main/scala
          • /mahout/trunk/spark/src/main/scala/org
          • /mahout/trunk/spark/src/main/scala/org/apache
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/ABt.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AewB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AinCoreB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/At.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AtA.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AtB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/DrmRddOps.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/Slicing.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/package.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/CheckpointedDrm.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/CheckpointedDrmBase.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/CheckpointedOps.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/DrmLike.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/DrmLikeOps.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/DrmRddInput.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/RLikeDrmOps.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/decompositions
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/decompositions/DQR.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/decompositions/DSSVD.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/package.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/AbstractBinaryOp.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/AbstractUnaryOp.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/CheckpointAction.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpABAnyKey.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpABt.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAewB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAewScalar.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAt.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAtA.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAtAnyKey.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAtB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpMapBlock.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpRowRange.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpTimesLeftMatrix.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpTimesRightMatrix.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/package.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/io
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/io/MahoutKryoRegistrator.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/io/WritableKryoSerializer.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/package.scala
          • /mahout/trunk/spark/src/test
          • /mahout/trunk/spark/src/test/scala
          • /mahout/trunk/spark/src/test/scala/org
          • /mahout/trunk/spark/src/test/scala/org/apache
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/blas
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/blas/ABtSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/blas/AewBSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/blas/AtASuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/blas/AtSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/DrmLikeOpsSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/DrmLikeSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/RLikeDrmOpsSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/decompositions
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/decompositions/MathSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/test
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/test/LoggerConfiguration.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/test/MahoutLocalContext.scala
          dlyubimov Dmitriy Lyubimov added a comment -

          Updating docs to reflect the latest committed state.
          Brought in distributed and in-core stochastic PCA scripts, colMeans, colSums, DRM-vector multiplication, more tests, etc. See the doc.

          Show
          dlyubimov Dmitriy Lyubimov added a comment - updating docs to reflect latest committed state. Brought in distributed and in-core stochastic PCA scripts, colmeans, colsums, drm-vector multiplication, more tests etc.etc. see the doc.
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Mahout-Quality #2526 (See https://builds.apache.org/job/Mahout-Quality/2526/)
          MAHOUT-1346: (D)-SPCA and other additions and fixes.

          Squashed commit of the following:

          commit 1ed6267a3cc550d1d648346fca7e152c63909667
          Merge: 95da094 c0260b5
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Mon Mar 17 11:38:33 2014 -0700

          Merge branch 'trunk' into dev-1.0-spark

          commit 95da094a38d7fbbd654e95bbfb5723c5e1e48c55
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Mon Mar 17 11:24:28 2014 -0700

          D-SPCA fixes. test is passing (in-core and out-of-core tests give the same result). However, the test seems to be still rank-deficient (rank=20 instead of desired 100). Not sure why – the Cholesky seems to be too sensitive.

          commit c907be0f9b35436cf623e1a92941f924382b531d
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Mon Mar 17 00:10:05 2014 -0700

          Fixes for D-SPCA: q=0 works, q>0 still doesn't

          commit 0e24fe88cbbae2db57c06b4d85f5df1b5a2925e5
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Sun Mar 16 18:16:27 2014 -0700

          added test for colSums, colMeans

          commit 1012d65ba360cfe944f2f559cf032c7546e19072
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Sun Mar 16 18:08:25 2014 -0700

          D-SPCA WIP. Test is not yet working with -q0

          commit 5560f1e928380280b23dbd35efec39b8566eb1a3
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Sun Mar 9 18:12:03 2014 -0700

          Fixng random gen in SPCA test codegen

          commit 740cac1eec6a4e1bf256066d86b7b5681f88bd4c
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Sun Mar 9 18:08:30 2014 -0700

          removing check for rank deficiency (so pca can complete). User can check the results for that if needed.
          Adding in-core s-pca test.

          commit f1abfe430987b60a62a0e65cf234ebdc40135275
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Sun Mar 9 16:19:32 2014 -0700

          perhaps a better ssvd test assertion

          commit e26596fd6cecb74055c38d838a8b02e9698b17f8
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Sat Mar 8 22:01:58 2014 -0800

          first write-up of in-core stochastic PCA

          commit eb3bf98d6abcd8f480e892c74bd7f61b32b33bdf
          Author: Dmitriy Lyubimov <dlyubimov@apache.org>
          Date: Sat Mar 8 20:35:24 2014 -0800

          rowsums, colsums, rowmeans, colmeans for in-core + tests (dlyubimov: rev 1578527)

          • /mahout/trunk/math-scala/pom.xml
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/MatrixOps.scala
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/SSVD.scala
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/VectorOps.scala
          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/package.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/MathSuite.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/MatlabLikeMatrixOpsSuite.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/MatrixOpsSuite.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/RLikeMatrixOpsSuite.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/RLikeVectorOpsSuite.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/VectorOpsSuite.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/test
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/test/LoggerConfiguration.scala
          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/test/MahoutSuite.scala
          • /mahout/trunk/spark/pom.xml
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AinCoreB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AtB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/Ax.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/CheckpointedDrm.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/CheckpointedDrmBase.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/CheckpointedOps.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/DrmLike.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/RLikeDrmOps.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/decompositions/DQR.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/decompositions/DSPCA.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/decompositions/DSSVD.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/package.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/CheckpointAction.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAewB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAewScalar.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAt.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAtAnyKey.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAtx.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpAx.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpMapBlock.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/plan/OpRowRange.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/DrmLikeOpsSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/RLikeDrmOpsSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/decompositions/MathSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/test/LoggerConfiguration.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/test/MahoutLocalContext.scala
          hudson Hudson added a comment -

          SUCCESS: Integrated in Mahout-Quality #2527 (See https://builds.apache.org/job/Mahout-Quality/2527/)
          MAHOUT-1346: better PCA test & input generator (dlyubimov: rev 1578663)

          • /mahout/trunk/math-scala/src/test/scala/org/apache/mahout/math/scalabindings/MathSuite.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/decompositions/MathSuite.scala
          dlyubimov Dmitriy Lyubimov added a comment -

          Actually, the non-slim A'A operator is practically A'B without the need for a zip... So we are almost done; the biggest work here is the test, I suppose.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Mahout-Quality #2538 (See https://builds.apache.org/job/Mahout-Quality/2538/)
          MAHOUT-1346 a bit more syntactically pallatable correction encoding for BB' (dlyubimov: rev 1581012)

          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/SSVD.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/decompositions/DSPCA.scala
          hudson Hudson added a comment -

          SUCCESS: Integrated in Mahout-Quality #2539 (See https://builds.apache.org/job/Mahout-Quality/2539/)
          MAHOUT-1346 style: removing my weird log variable convention, Mahout doesn't use that (dlyubimov: rev 1582021)

          • /mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/SSVD.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AtA.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/blas/AtB.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/decompositions/DQR.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/package.scala
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/package.scala
          • /mahout/trunk/spark/src/test/scala/org/apache/mahout/sparkbindings/drm/RLikeDrmOpsSuite.scala
            MAHOUT-1346: don't evaluate debug printout in no-debug mode (dlyubimov: rev 1582020)
          • /mahout/trunk/spark/src/main/scala/org/apache/mahout/sparkbindings/decompositions/DQR.scala
          dlyubimov Dmitriy Lyubimov added a comment -

          Added component stack diagram.


            People

            • Assignee:
              dlyubimov Dmitriy Lyubimov
              Reporter:
              dlyubimov Dmitriy Lyubimov
            • Votes:
              0
              Watchers:
              4
