
MAHOUT-792: Add new stochastic decomposition code

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.6
    • Component/s: None
    • Labels: None

      Description

      I have figured out some simplification for our SSVD algorithms. This eliminates the QR decomposition and makes life easier.

      I will produce a patch that contains the following:

    • a CholeskyDecomposition implementation that does pivoting (and is thus rank-revealing) or not. This should actually be useful for the solution of large out-of-core least-squares problems.
      • an in-memory SSVD implementation that should work for matrices up to about 1/3 of available memory.
      • an out-of-core SSVD threaded implementation that should work for very large matrices. It should take time about equal to the cost of reading the input matrix 4 times and will require working disk roughly equal to the size of the input.
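
      In outline, the simplified pipeline looks like the sketch below. This is a minimal sketch against the Mahout matrix API, not the patch itself; the CholeskyDecomposition constructor and solveRight are named per the discussion in the comments, and their exact signatures in the patch may differ.

          import org.apache.mahout.math.CholeskyDecomposition;
          import org.apache.mahout.math.Matrix;
          import org.apache.mahout.math.SingularValueDecomposition;

          // Sketch of the in-memory QR-free SSVD: a is m x n, omega is a random
          // n x (k+p) projection matrix.
          static double[] ssvdSketch(Matrix a, Matrix omega) {
            Matrix y = a.times(omega);                 // Y = A * Omega, m x (k+p)
            // Cholesky of the small Gram matrix Y'Y yields R with R'R = Y'Y,
            // the same R as in the QR decomposition Y = QR, without a QR step.
            CholeskyDecomposition chol = new CholeskyDecomposition(y.transpose().times(y));
            Matrix q = chol.solveRight(y);             // Q = Y * R^-1 by triangular solve (assumed semantics)
            Matrix b = q.transpose().times(a);         // B = Q'A, small and dense: (k+p) x n
            // The SVD of B approximates that of A; left singular vectors are
            // recovered as Q times the U of this small SVD.
            return new SingularValueDecomposition(b).getSingularValues();
          }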
      Attachments

      1. MAHOUT-792.patch (211 kB, Ted Dunning)
      2. MAHOUT-792.patch (231 kB, Ted Dunning)
      3. sd-2.pdf (142 kB, Ted Dunning)


          Activity

          Ted Dunning added a comment -

          This is the current state. This has an in-memory implementation and an out-of-core implementation for review. The out-of-core implementation needs to be changed to use threads well before it is production ready.

          Ted Dunning added a comment -

          This presumes MAHOUT-790 and MAHOUT-793 and includes them in this patch.

          Ted Dunning added a comment -

          Github branch MAHOUT-792 available at git://github.com/tdunning/mahout.git

          That provides more details on successive changes.

          Ted Dunning added a comment -

          Updated with small fixes. The tests have improved enough that I am beginning to believe that this might be correct.

          Github has been updated as well.

          Dmitriy Lyubimov added a comment -

          a) It looks like it affects a lot of not very relevant files (GradientMachine, AbstractClassifier)?

          b) Ted, are you going to implement an MR version? Do you have a parallelization strategy?

          Thanks.

          Ted Dunning added a comment -

          On (a), the patch includes MAHOUT-790 since it depends on it. When I commit that JIRA, it will be easier to read.

          On (b), see the paper. Yes.

          Dmitriy Lyubimov added a comment -

          This looks like a very promising shortcut for MR operationally.

          AFAICT, you suggest computing Y'Y and then deriving R^-1 from it, which is used in another pass to derive B. Y'Y is indeed operationally much easier to accumulate and produce with MR in the same projection step.

          One thing that I don't understand is how you are proposing to derive this R^-1. Do Cholesky on Y'Y, take the transpose and inverse, and declare that R^-1?
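
          For concreteness, the Y'Y accumulation could look like the following (an illustrative sketch, not code from the patch; blocksOfA and omega are assumed inputs):

            import org.apache.mahout.math.DenseMatrix;
            import org.apache.mahout.math.Matrix;
            import org.apache.mahout.math.function.Functions;

            // Accumulate the (k+p) x (k+p) Gram matrix Y'Y while streaming over
            // row blocks of A; in MR, each mapper emits its partial sum and a
            // single reducer adds them up.
            static Matrix gramOfY(Iterable<Matrix> blocksOfA, Matrix omega, int kp) {
              Matrix yty = new DenseMatrix(kp, kp);
              for (Matrix aBlock : blocksOfA) {
                Matrix yBlock = aBlock.times(omega);  // this block's rows of Y = A * Omega
                yty.assign(yBlock.transpose().times(yBlock), Functions.PLUS);
              }
              return yty;
            }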

          Dmitriy Lyubimov added a comment (edited) -

          more thoughts:

          1 – it looks like this method is compressing a lot of info into k+p upper-triangular R. I guess there's more potential for rounding errors there. (I am not buying too much into rounding errors though).

          2 – The argument of the Q matrix being large is not very convincing to me. Q is m x (k+p); that is, assuming A is much wider (in terms of the average # of non-zero elements), which it should be in order for the projection method to make sense at all, Q is a fraction of A. Besides, it is written in blocks by mappers, and read by mappers, so each mapper sees only one block of size, say, 30,000 x (k+p) (and no more than that, assuming A is sufficiently wide), which, assuming k+p=500, amounts to 120 MB per mapper. So there's not so much actual memory pressure; that's what distributed computations are for. Same goes for dense operations, we just distribute them.

          3 – it looks like you are writing B as output of the second MR, which is (k+p) x n. Going back to the argument of a 'big Q', we can't assume that B would be any less. In fact, some time ago I came to the conclusion that the projection method would be much more efficient if the input is wide rather than tall, since the projection compression factor would be much better for what seems to be a fairly inexpensive operation (it costs nothing to redistribute Omega, which is backed by only one long number as a random generator seed, as opposed to actual long or wide bands such as B or Q). So we can't exclude very wide inputs.

          Overall it looks like a great improvement. I am not convinced that it would cut processing time that much (it looks like it has the same number of MR steps, but all of them would require a shuffle-and-sort and reduce phase, whereas the QR version is a reduce-less process), but it definitely reduces the complexity of the MR implementation and I would be very eager to try it out. Certainly all I said is first impression and intuition only; and in my experience intuition turns out to be wrong surprisingly often as far as benchmark guesstimates are concerned.

          Ted Dunning added a comment -

          One thing that I don't understand is how you are proposing to derive this R^-1. Do Cholesky on Y'Y, take the transpose and inverse, and declare that R^-1?

          This is just a notational deficiency. The CholeskyDecomposition has solveLeft and solveRight methods that use back or forward substitution to effectively right- or left-multiply by the inverse of R (or R'). I find my current naming confusing and think that it might be better to have methods timesLinv, timesRinv, leftTimesLinv and leftTimesRinv, or some such, because that would match the actual mathematical exposition better. If you have a good idea for naming, please suggest it.
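
          Spelled out (a restatement of the identity, not text from the patch): if Y = QR is the QR decomposition of Y, then

            Y'Y = R'Q'QR = R'R

          so the Cholesky factor of Y'Y is exactly the R of the QR decomposition, and Q = Y R^-1 is obtained by back substitution against the triangular R rather than by forming an explicit inverse.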

          Ted Dunning added a comment -

          1 – it looks like this method is compressing a lot of info into k+p upper-triangular R. I guess there's more potential for rounding errors there. (I am not buying too much into rounding errors though).

          This is absolutely true. On the other hand, this is only really putting as much information into that matrix as we ultimately want back out (i.e. we are reducing dimension here).

          2 – The argument of the Q matrix being large is not very convincing to me. Q is m x (k+p); that is, assuming A is much wider (in terms of the average # of non-zero elements), which it should be in order for the projection method to make sense at all, Q is a fraction of A.

          But A is sparse so Y, B and Q are all about the same (storage) size as A. In fact, if k+p > average number of elements per row of A, then Y, B and Q will all be larger than A.

          Besides, it is written in blocks by mappers, and read by mappers, so each mapper sees only one block of size, say, 30,000 x (k+p) (and no more than that, assuming A is sufficiently wide), which, assuming k+p=500, amounts to 120 MB per mapper. So there's not so much actual memory pressure; that's what distributed computations are for. Same goes for dense operations, we just distribute them.

          Of course. But using the Cholesky trick means that Q'A can be done by reading only A instead of Q and A.

          3 – it looks like you are writing B as output of the second MR, which is (k+p) x n. Going back to the argument of a 'big Q', we can't assume that B would be any less.

          I completely agree. B is too big to store all in memory and could easily be comparable to or larger than A.

          In fact, some time ago I came to the conclusion that the projection method would be much more efficient if the input is wide rather than tall, since the projection compression factor would be much better for what seems to be a fairly inexpensive operation (it costs nothing to redistribute Omega, which is backed by only one long number as a random generator seed, as opposed to actual long or wide bands such as B or Q). So we can't exclude very wide inputs.

          Indeed.

          The common web-case could easily wind up that way. You might have a million users exploring a billion web pages.

          Overall it looks like a great improvement. I am not convinced that it would cut processing time that much (it looks like it has the same number of MR steps, but all of them would require a shuffle-and-sort and reduce phase, whereas the QR version is a reduce-less process), but it definitely reduces the complexity of the MR implementation

          This is my thought exactly. It also decreases the memory load of in-memory implementations, which is good. But as you say later, intuition is extraordinarily dangerous for run-times.

          Dmitriy Lyubimov added a comment -

          But A is sparse so Y, B and Q are all about the same (storage) size as A. In fact, if k+p > average number of elements per row of A, then Y, B and Q will all be larger than A.

          That's true. I had one guy who had millions of rows but only 10 measurements per row on average. Probably ratings or something. It is not going to be efficient (CPU-wise) in these cases.

          But my response has always been, if you have input thinner than projection, then why even use projection? It's my understanding that the whole idea of this is to drastically reduce the input to analyze. The original paper actually never suggested computing BB' as far as I remember; it's something I did to open up n by sacrificing rounding accuracy. In the original paper they compute the SVD of B, so if B is bigger than the input, it would only cost more to compute its SVD. So that's what I understood – B is supposed to be much smaller than A in the original work, otherwise there's not much sense.

          Ted Dunning added a comment -

          But my response has always been, if you have input thinner than projection, then why even use projection?

          Because it is cheaper to process a thin dense matrix than a wide sparse one.

          Likewise, even if B is bigger than A, it can be much cheaper to compute the SVD of B than of A if only because the Cholesky trick works on the skinny dense case. If the Cholesky decomposition loses some accuracy then a QR or LQ decomposition could be used the same way.

          Dmitriy Lyubimov added a comment -

          Ted,
          what's the degree of stability of the CholeskyDecomposition code? I am getting a test failure:

          Failed tests: 
            testSingularValues(org.apache.mahout.math.ssvd.SequentialOutOfCoreSvdTest): expected:<0.0> but was:<5250028.000000007>
          
          Ted Dunning added a comment -

          It should be very good.

          I have a few corrections that I need to post. That might be the cause of your problem.


          Hudson added a comment -

          Integrated in Mahout-Quality #1012 (See https://builds.apache.org/job/Mahout-Quality/1012/)
          MAHOUT-790 - kill the cardinality array and size() for matrices. Use rowSize() and columnSize() instead.

          MAHOUT-792 - Fix RTM to avoid size() and cardinality array.

          tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1164016
          Files :

          • /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/eval/InMemoryFactorizationEvaluator.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/bayes/InMemoryBayesDatastore.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/discriminative/LinearTrainer.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/naivebayes/NaiveBayesModel.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/naivebayes/training/TrainUtils.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sequencelearning/hmm/HmmUtils.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/MatrixWritable.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/decomposer/EigenVerificationJob.java
          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/UpperTriangular.java
          • /mahout/trunk/core/src/test/java/org/apache/mahout/cf/taste/hadoop/als/ParallelALSFactorizationJobTest.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/AbstractMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/DenseMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/DiagonalMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/Matrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/MatrixView.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/PermutedVectorView.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/PivotedMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/RandomAccessSparseVector.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/RandomTrinaryMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/SequentialAccessSparseVector.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseColumnMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/SparseRowMatrix.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/decomposer/hebbian/HebbianSolver.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/AbstractTestVector.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/MatrixTest.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestMatrixView.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestSparseColumnMatrix.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestSparseMatrix.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestSparseRowMatrix.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/TestVectorView.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/als/AlternateLeastSquaresSolverTest.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/decomposer/SolverTest.java
          Lance Norskog added a comment (edited) -

          Please run this:

           import org.apache.mahout.math.RandomTrinaryMatrix;

           public static void main(String[] args) {
             // Print every element of a seeded RandomTrinaryMatrix twice; both
             // passes use seed 0, so the output exposes the distribution of
             // -1/0/+1 values produced by the chosen generation mode.
             RandomTrinaryMatrix rtm = new RandomTrinaryMatrix(0, 30, 40, true);
             for (int row = 0; row < 30; row++)
               for (int column = 0; column < 40; column++)
                 System.out.println(rtm.get(row, column));
             rtm = new RandomTrinaryMatrix(0, 30, 40, true);
             for (int row = 0; row < 30; row++)
               for (int column = 0; column < 40; column++)
                 System.out.println(rtm.get(row, column));
           }
          Ted Dunning added a comment -

          Lance,

          Can you say why? What is it that one would see here?

          Lance Norskog added a comment -

          Yes, bad habit. The "low-grade" mode does not emit 0. The high-grade mode does not emit negative numbers.

          I'm curious: what is the math behind an even split of -1,0,1? The Achlioptas math says +1/-1 or +1/-1/0/0/0 are OK. So, the low-grade mode is proven correct, just not what the comment describes.
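
          For reference, the two distributions Achlioptas proved admissible, as a standalone sketch (not the RandomTrinaryMatrix code; both have zero mean and unit variance):

            import java.util.Random;

            // Achlioptas-style random projection entries: either +1/-1 with
            // probability 1/2 each, or sqrt(3) * {+1 w.p. 1/6, 0 w.p. 2/3, -1 w.p. 1/6}.
            static double plusMinusOne(Random rng) {
              return rng.nextBoolean() ? 1.0 : -1.0;
            }

            static double sparseTrinary(Random rng) {
              switch (rng.nextInt(6)) {           // uniform on {0, ..., 5}
                case 0:  return Math.sqrt(3.0);   // probability 1/6
                case 1:  return -Math.sqrt(3.0);  // probability 1/6
                default: return 0.0;              // probability 2/3
              }
            }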

          Dmitriy Lyubimov added a comment -

          So, this is not fully committed yet?

          I don't see our version of the CholeskyDecomposition class on the trunk (only apache-math's one). Nor do I see the SequentialOutOfCoreSvd etc. that seem to be present in the posted patch?

          Ted Dunning added a comment -

          No. I was waiting for comments. If I can find time between packing this evening I will commit.

          Dmitriy Lyubimov added a comment (edited) -

          Another thing that I realized while working on MAHOUT-797 is that this execution plan requires three passes over the input A instead of the existing one pass, if we are not saving Y (the first to compute Y'Y, the second to compute B, and the third to compute U).

          So we either have to save Y, or incur extra passes over the big dataset (assuming A is much denser than Y).

          And if we do save Y, then we swap Q for Y, which are the same size.

          So... it looks like we have to save Y, or it is not worth it.

          Ted Dunning added a comment -

          If you look at my comments in sd-2, you see that Y'Y is computed in one pass, and then in the pass where you compute Q'A using Y, I just recompute Y on the fly and solve using R to get a block of Q, also on the fly. Since we are reading blocks of A anyway in this pass, it costs nothing except a small amount of computation to avoid storing Y.
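
          As a sketch, that pass could look like this (illustrative names, not the patch's code; solveRight is assumed to right-multiply by R^-1):

            import org.apache.mahout.math.CholeskyDecomposition;
            import org.apache.mahout.math.DenseMatrix;
            import org.apache.mahout.math.Matrix;
            import org.apache.mahout.math.function.Functions;

            // B = Q'A in one pass over A, never storing Y or Q: each block of Y is
            // regenerated from its block of A and the seeded Omega, then turned
            // into a block of Q by a triangular solve against R.
            static Matrix qPrimeA(Iterable<Matrix> blocksOfA, Matrix omega,
                                  CholeskyDecomposition chol, int kp, int n) {
              Matrix b = new DenseMatrix(kp, n);
              for (Matrix aBlock : blocksOfA) {
                Matrix yBlock = aBlock.times(omega);      // recompute Y block on the fly
                Matrix qBlock = chol.solveRight(yBlock);  // Q block = Y block * R^-1
                b.assign(qBlock.transpose().times(aBlock), Functions.PLUS);
              }
              return b;
            }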

          Dmitriy Lyubimov added a comment -

          Yes... and that's what I mean. I was implementing it as Y-on-the-fly. But that implies a full pass over A every time we need access to Y, and we need it 3 times without the option to parallelize. That's why I think I need to save it.

          Also, I am thinking one step ahead, to power iterations. In that chain, Yi is AB', and there's no way to compute that on the fly. So, for a first stab at it, it would make my life easier to save Y with a low degree of replication, and then use the rest of the pipeline the same way regardless of i. Besides, I think that saving Y is supposed to make things more efficient, not less, in the most generic cases, assuming A >> Y in volume.

          Ted Dunning added a comment -

          Yes... and that's what I mean. I was implementing it as Y-on-the-fly. But that implies a full pass over A every time we need access to Y, and we need it 3 times without the option to parallelize. That's why I think I need to save it.

          Without the power iterations, every time that Y is needed, A is also scanned. That means that Y-on-the-fly is fine.

          But I think that saving Y is a fine idea. It should usually (but surprisingly, not at all always) be smaller than A. It is the same size as Q.

          Ted Dunning added a comment -

          So we either have to save Y, or incur extra passes over the big dataset (assuming A is much denser than Y).

          I think that this got mangled in the typing. A is sparser than Y and may be larger or smaller depending on the average number of non-zero elements. If A is binary or contains only small integers, then it could well be smaller on disk than Y if a continuous random matrix Omega is used. If Omega contains only -1, 0, 1 (trinary values), then the values of Y should compress nearly as well as the values of A and will be dense as well so we don't have to store the indices.

          Dmitriy Lyubimov added a comment -

          I like the trinary matrix idea. I would like to add it as an option in time. Indeed, it may save space for specific extra-sparse cases.

          Dan Brickley added a comment -

          Looking to try this, I tried applying the last patch per https://cwiki.apache.org/MAHOUT/how-to-contribute.html#HowToContribute-Applyingapatch i.e. 'patch -p 0 -i MAHOUT-792.patch', and get a pile of 'Reversed (or previously applied) patch detected' and a 'can't find file to patch'. Is some but not all of this now committed?

          Per "Github branch MAHOUT-792 available at git://github.com/tdunning/mahout.git" I'm looking around https://github.com/tdunning/mahout but there's no branch listed of that name. Presumably https://github.com/tdunning/mahout/tree/new-stochastic is the place to go?

          Ted Dunning added a comment -

          Yes. The new-stochastic branch is the one to go with. It is a little (but not much) out of date.

          Ted Dunning added a comment -

          Committed this new code.

          Ted Dunning added a comment -

          Finally committed these. This provides Cholesky decomposition and several in-memory implementations of SSVD. These are deficient with respect to power iterations.

          Hudson added a comment -

          Integrated in Mahout-Quality #1265 (See https://builds.apache.org/job/Mahout-Quality/1265/)
          MAHOUT-792 - More small fixes. Added internal exception class for CholeskyDecomp
          MAHOUT-792 - Made tests use odd sizes to detect row/column confusion. Fixed small errors in out of core SVD
          MAHOUT-792 - New SSVD codes.

          MAHOUT-792 - Copyright fixes for Cholesky
          MAHOUT-792 - Additional test for QR decomposition.
          MAHOUT-792 - Cholesky decomposition

          MAHOUT-792 - New Cholesky decomposition for SSVD update.

          MAHOUT-792 - Switch Cholesky to views.

          tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222517
          Files :

          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/ssvd/SequentialOutOfCoreSvd.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/CholeskyDecomposition.java

          tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222516
          Files :

          • /mahout/trunk/core/src/test/java/org/apache/mahout/math/ssvd/SequentialOutOfCoreSvdTest.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/ssvd/SequentialBigSvd.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/ssvd/SequentialBigSvdTest.java

          tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222515
          Files :

          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/ssvd
          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/ssvd/SequentialOutOfCoreSvd.java
          • /mahout/trunk/core/src/test/java/org/apache/mahout/math/ssvd
          • /mahout/trunk/core/src/test/java/org/apache/mahout/math/ssvd/SequentialOutOfCoreSvdTest.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/CholeskyDecomposition.java
          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/ssvd/SequentialBigSvd.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/CholeskyDecompositionTest.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/ssvd
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/ssvd/SequentialBigSvdTest.java

          tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222514
          Files :

          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/QRDecompositionTest.java

          tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1222513
          Files :

          • /mahout/trunk/math/src/main/java/org/apache/mahout/math/CholeskyDecomposition.java
          • /mahout/trunk/math/src/test/java/org/apache/mahout/math/CholeskyDecompositionTest.java
          Ted Dunning added a comment -

          I don't think that this issue should have been closed.

          Ted Dunning added a comment -

          My current theory is that the problem has to do with the ordering of the result of File.listFiles. I just committed a cleanup fix. The cleanups were important in any case.

          Let's see what Jenkins has to say.

          Ted Dunning added a comment -

          Jenkins succeeded after this change. Good news.

          I will try on an EC2 instance to see if I can reproduce it.

          Hmm... https://builds.apache.org/job/mahout-nightly/752/console shows the evil test still failing.

          Very confusing.

          Ted Dunning added a comment -

          This is not obviously an Ubuntu portability issue. We have failures on ubuntu4 and ubuntu5, successes on ubuntu3 and solaris1. And then a failure on ubuntu3.

          Ted Dunning added a comment -

          OK. I think I have this failing test corralled. The problem was that the test case was reading the results and assuming the blocks in the result matrix would be sorted. This is true on the Mac and sometimes true on Ubuntu. Once it fails, it is pretty consistent.

          I put a sort into the test and all seems well. I replicated the failure 3 times in a row, added the sort, and verified that the failure did not occur on multiple runs. I removed the sort and the failure returned consistently.

          The fix is forthcoming shortly, as soon as I do the same tests on Ubuntu 10 as well as 11.
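
          The gist of the fix, as a sketch (tmpDir stands in for the test's scratch directory):

            import java.io.File;
            import java.util.Arrays;

            // File.listFiles returns entries in an unspecified, platform-dependent
            // order; sorting by name (File's natural ordering) makes the block
            // iteration deterministic across machines.
            File[] blockFiles = tmpDir.listFiles();
            Arrays.sort(blockFiles);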

          Hudson added a comment -

          Integrated in Mahout-Quality #1309 (See https://builds.apache.org/job/Mahout-Quality/1309/)
          MAHOUT-792 - Forced correct block ordering in out-of-core SVD. Hopefully addresses ubuntu test failures. Also forced file closing.

          tdunning : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1231800
          Files :

          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/ssvd/SequentialOutOfCoreSvd.java
          • /mahout/trunk/core/src/test/java/org/apache/mahout/math/ssvd/SequentialOutOfCoreSvdTest.java
          Ted Dunning added a comment -

          The last few Jenkins runs confirm my expectation that this problem was a bug in the test case itself, which is now fixed.


            People

            • Assignee: Ted Dunning
            • Reporter: Ted Dunning
            • Votes: 0
            • Watchers: 0
