Spark / SPARK-21305

The BKM (best known methods) for using native BLAS to improve ML/MLlib performance

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: Documentation, ML
    • Labels:
      None

      Description

      Many ML/MLlib algorithms use native BLAS (such as Intel MKL, ATLAS, or OpenBLAS) to improve performance.
      How native BLAS is used matters a great deal for performance: in some cases (quite often, in fact) native BLAS even makes performance worse.
      One example is the ALS recommendForAll method before Spark 2.2, which uses BLAS gemm for matrix multiplication.
      If you test only the matrix multiplication performance of native BLAS gemm (such as Intel MKL or OpenBLAS) against netlib-java F2j BLAS gemm, native BLAS is about 10x faster. But if you test the end-to-end performance of the Spark job, F2j is much faster than native BLAS, which is a surprising result.

      I spent a lot of time on this problem and found that we should not use a multi-threaded native BLAS (such as OpenBLAS or Intel MKL) without any configuration. By default, these libraries enable multi-threading, which conflicts with the Spark executor. You can still use a multi-threaded native BLAS, but it is better to disable multi-threading first.

      https://github.com/xianyi/OpenBLAS/wiki/faq#multi-threaded
      https://software.intel.com/en-us/articles/recommended-settings-for-calling-intel-mkl-routines-from-multi-threaded-applications
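
      One possible way to apply the single-thread setting is sketched below. It is an illustration only, not the change made for this issue. It assumes the standard OPENBLAS_NUM_THREADS and MKL_NUM_THREADS environment variables and Spark's spark.executorEnv.<NAME> property for exporting environment variables to executors; the helper names blas_single_thread_conf and to_submit_args are hypothetical.

```python
# Sketch: cap native BLAS threading for Spark executors.
# OPENBLAS_NUM_THREADS and MKL_NUM_THREADS are the environment variables
# read by OpenBLAS and Intel MKL. spark.executorEnv.<NAME> is the Spark
# property that exports an environment variable to executor processes.
# The helper names below are hypothetical and exist only for illustration.

BLAS_THREAD_VARS = ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def blas_single_thread_conf(threads=1):
    """Return Spark properties that limit native BLAS threads on executors."""
    return {f"spark.executorEnv.{name}": str(threads) for name in BLAS_THREAD_VARS}


def to_submit_args(conf):
    """Render the properties as spark-submit --conf arguments, sorted by key."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Prints the arguments to append to a spark-submit command line.
    print(" ".join(to_submit_args(blas_single_thread_conf())))
```

      These properties only affect executors. If the driver also calls native BLAS, the same variables would need to be set in the driver's environment as well, for example in conf/spark-env.sh.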

      I think we should start by adding some notes on this to docs/ml-guide.md.

              People

              • Assignee: Peng Meng (peng.meng@intel.com)
              • Reporter: Peng Meng (peng.meng@intel.com)
              • Votes: 0
              • Watchers: 8

                  Time Tracking

                  • Original Estimate: 504h
                  • Remaining Estimate: 504h
                  • Time Spent: Not Specified