SystemDS / SYSTEMDS-1140

Sparse/Caching performance bugs related to deep learning scripts


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: SystemML 1.0.0, SystemML 1.1
    • Fix Version/s: SystemML 1.1
    • Component/s: None
    • Labels: None

    Description

      We have identified two performance bugs that frequently occur in deep learning scripts.

      First, we repeatedly perform unnecessary conversions to sparse format, even though operations such as matrix multiplication (including the BLAS and CuBLAS backends) are optimized for dense inputs.
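      A minimal sketch of the kind of density guard that avoids such conversions; the threshold and all names here are hypothetical, not SystemML's actual internals:

```java
// Hypothetical density guard: convert to sparse only when it actually pays off.
public class FormatGuard {
    // Assumed threshold; SystemML derives its own sparsity estimates.
    private static final double SPARSITY_THRESHOLD = 0.4;

    /** Decide whether a block is worth storing in sparse format. */
    public static boolean worthConvertingToSparse(long rows, long cols, long nnz) {
        double sparsity = (double) nnz / ((double) rows * cols);
        // Dense kernels (BLAS/CuBLAS) win unless the block is truly sparse,
        // so above the threshold we stay dense and skip the conversion cost.
        return sparsity < SPARSITY_THRESHOLD;
    }
}
```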

      Second, even with a large memory budget, we sometimes spend almost 20-30% of the execution time in caching.

      mboehm7, reinwald, mwdusenb@us.ibm.com: I am labeling this bug as a blocker for SystemML 1.0. Please feel free to assign this issue to yourself.

      Improvements so far:

      1. Disabled unnecessary sparse conversions and caching (the density-guard sketch above illustrates the conversion side), by commit

      2. Binary sparse-dense multiplication/division with output preallocation (sketched after this list), by commit

      3. For `conv_2d_bias_add`, the `elementWiseInPlaceTransposedAddition` method first aggregates the partial blocks without transposing them, and then performs a cache-conscious transpose into the output (sketched after this list), by commit

      4. Reduced the serialization overhead of sparse matrices (in MCSR format) on bufferpool write by using the inMemorySize of the cache block (sketched after this list), by commit

      5. Improved the performance of removeEmpty(rows) and order via a shallow copy of sparse rows, exploiting the fact that these operations do not modify the actual sparse rows (sketched after this list), by commit
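      A minimal sketch of item 2's sparse-dense elementwise multiply, using CSR-style arrays (SystemML's MCSR blocks differ in layout; all names here are illustrative). Only the nonzeros of the sparse side are touched, and the output value array is preallocated once, since the result inherits the sparse input's pattern; division with a sparse numerator is analogous.

```java
// Illustrative sparse-dense elementwise multiply over CSR-style arrays.
public class SparseDenseMult {
    /**
     * out = sparse .* dense, where the sparse side is given as CSR arrays.
     * The output values are preallocated once; rowPtr/colIdx can be shared,
     * because zero * x = zero preserves the sparse input's pattern.
     */
    public static double[] multiply(int rows, int cols,
                                    int[] rowPtr, int[] colIdx, double[] vals,
                                    double[] dense /* row-major, rows*cols */) {
        double[] outVals = new double[vals.length]; // preallocated result values
        for (int i = 0; i < rows; i++) {
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; p++) {
                // Touch only the nonzeros of the sparse input.
                outVals[p] = vals[p] * dense[i * cols + colIdx[p]];
            }
        }
        return outVals;
    }
}
```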
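      Item 3's two-phase scheme, sketched under assumed types (plain double[] blocks here; the real code operates on SystemML matrix blocks, and the method names below are descriptive, not the actual API): partial results are summed as-is without transposing each one, and a single blocked transpose then writes the aggregate to the output with better cache locality.

```java
// Illustrative two-phase scheme: aggregate w/o transpose, then one
// cache-conscious blocked transpose into the output.
public class TransposedAdd {
    private static final int BLK = 64; // tile size chosen for cache locality

    /** Phase 1: sum all partials element-wise, without transposing each one. */
    public static double[] aggregate(double[][] partials, int len) {
        double[] agg = new double[len];
        for (double[] p : partials)
            for (int i = 0; i < len; i++)
                agg[i] += p[i];
        return agg;
    }

    /** Phase 2: blocked (cache-conscious) transpose of a rows x cols row-major matrix. */
    public static double[] transpose(double[] in, int rows, int cols) {
        double[] out = new double[rows * cols];
        for (int bi = 0; bi < rows; bi += BLK)
            for (int bj = 0; bj < cols; bj += BLK)
                for (int i = bi; i < Math.min(bi + BLK, rows); i++)
                    for (int j = bj; j < Math.min(bj + BLK, cols); j++)
                        out[j * rows + i] = in[i * cols + j];
        return out;
    }
}
```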
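      A sketch of item 4's idea, with an illustrative CacheBlock interface only loosely modeled on SystemML's: sizing the write buffer from the O(1) in-memory estimate avoids an extra pass over the MCSR rows to compute the exact serialized size.

```java
// Sketch of the bufferpool-write idea: size the buffer from the cheap,
// slightly pessimistic in-memory estimate instead of an exact-size scan.
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class BufferPoolWrite {
    interface CacheBlock {
        long getInMemorySize();              // O(1) estimate kept with the block
        void write(DataOutputStream out) throws IOException;
    }

    public static byte[] serialize(CacheBlock block) throws IOException {
        // Preallocate from the in-memory estimate; no scan of the MCSR rows.
        int capacity = (int) Math.min(block.getInMemorySize(), Integer.MAX_VALUE);
        ByteArrayOutputStream bos = new ByteArrayOutputStream(capacity);
        block.write(new DataOutputStream(bos));
        return bos.toByteArray();
    }
}
```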
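      Item 5's shallow-copy trick, sketched with a stand-in SparseRow type: because removeEmpty(rows) and order never modify row contents, the output block can simply reference the input's row objects instead of deep-copying their index and value arrays.

```java
// Illustrative shallow-copy removeEmpty(rows); order can share rows the same way.
import java.util.ArrayList;
import java.util.List;

public class ShallowRemoveEmpty {
    static class SparseRow {
        int[] idx; double[] vals; // column indices and values of one row
        SparseRow(int[] idx, double[] vals) { this.idx = idx; this.vals = vals; }
        boolean isEmpty() { return idx.length == 0; }
    }

    /** removeEmpty(rows): keep references to non-empty rows, no value copies. */
    public static List<SparseRow> removeEmptyRows(SparseRow[] rows) {
        List<SparseRow> out = new ArrayList<>();
        for (SparseRow r : rows)
            if (r != null && !r.isEmpty())
                out.add(r); // shallow copy: shared, read-only row
        return out;
    }
}
```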


            People

              Assignee: Matthias Boehm (mboehm7)
              Reporter: Niketan Pansare (niketanpansare)
              Votes: 0
              Watchers: 3
