Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-16820

Sparse - Sparse matrix multiplication

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Closed
    • Major
    • Resolution: Incomplete
    • 2.0.0
    • None
    • ML

    Description

      While working on MCL implementation on Spark we have encountered some difficulties.
      The main part of this process is distributed sparse matrix multiplication that has two main steps:
      1. Simulate multiply – preparation before the real multiplication in order to see which blocks should be multiplied.
      2. The actual blocks multiplication and summation.

      In our case the sparse matrix has 50M rows and columns, and 2B non-zeros.

      The current multiplication suffers from these issues:

      1. A relatively trivial bug already fixed in the first step the caused the process to be very slow SPARK-16469
      2. Still after the bug fix, if we have too many blocks the Simulate multiply will take very long time and will multiply the data many times. (O(n^3) where n is the number of blocks)
      3. Spark supports only multiplication with Dense matrices. Thus, it converts a Sparse matrix into a dense matrix before the multiplication.
      4. For summing the intermediate block results Spark uses Breeze’s CSC matrix operations – here the problem is that it is very inefficient to update a CSC matrix in a zero value.

      That means that with many blocks (default block size is 1024) – in our case 50M/1024 ~= 50K, the simulate multiply will effectively never finish or will generate 50K*16GB ~= 1000TB of data. On the other hand, if we use bigger block size e.g. 100k we get OutOfMemoryException in the “toDense” method of the multiply. We have worked around that by implementing our-selves both the Sparse multiplication and addition in a very naïve way – but at least it works.

      Attachments

        Activity

          People

            Unassigned Unassigned
            uzadude Ohad Raviv
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: