[SPARK-21305] The BKM (best known methods) of using native BLAS to improve ML/MLLIB performance - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 2.2.0
Fix Version/s: 2.3.0
Component/s: Documentation, ML
Labels: None

Description

Many ML/MLlib algorithms use native BLAS (such as Intel MKL, ATLAS, or OpenBLAS) to improve performance.
How native BLAS is used matters a great deal for performance; in some cases (quite likely, in fact) native BLAS even makes performance worse.
For example, the ALS recommendForAll method before Spark 2.2 uses BLAS gemm for matrix multiplication.
If you benchmark only the matrix multiplication, native BLAS gemm (such as Intel MKL or OpenBLAS) is about 10X faster than netlib-java F2j BLAS gemm. But if you measure the end-to-end performance of the Spark job, F2j is much faster than native BLAS, which is very interesting.

I spent a lot of time on this problem and found that we should not use a multi-threaded native BLAS (such as OpenBLAS or Intel MKL) without any configuration. By default, these native BLAS libraries enable multi-threading, which conflicts with the Spark executor's own task parallelism. You can use a multi-threaded native BLAS, but it is better to disable its multi-threading first.

https://github.com/xianyi/OpenBLAS/wiki/faq#multi-threaded
https://software.intel.com/en-us/articles/recommended-settings-for-calling-intel-mkl-routines-from-multi-threaded-applications
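
As a rough illustration of "disable multi-thread first", the sketch below builds Spark configuration entries that pin OpenBLAS and Intel MKL to one thread on each executor. It assumes the libraries honor the OPENBLAS_NUM_THREADS and MKL_NUM_THREADS environment variables (as the two pages linked above describe) and relies on Spark's spark.executorEnv.[name] setting to export them; the helper names are hypothetical and not part of Spark. The last helper is only back-of-the-envelope arithmetic for why the default threading can oversubscribe cores.

```python
# Environment variables that OpenBLAS and Intel MKL read to cap their thread pools.
BLAS_THREAD_VARS = ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def single_threaded_blas_conf(env_vars=BLAS_THREAD_VARS):
    """Return Spark conf entries that set each variable to 1 on every executor."""
    return {f"spark.executorEnv.{name}": "1" for name in env_vars}


def spark_submit_flags(conf):
    """Render conf entries as --conf flags that can be passed to spark-submit."""
    flags = []
    for key in sorted(conf):
        flags += ["--conf", f"{key}={conf[key]}"]
    return flags


def worst_case_threads(concurrent_tasks, blas_threads_per_call):
    """Threads competing for cores if every running task is inside a BLAS call."""
    return concurrent_tasks * blas_threads_per_call


if __name__ == "__main__":
    # Flags to append to a spark-submit command line.
    print(" ".join(spark_submit_flags(single_threaded_blas_conf())))
    # Hypothetical executor with 8 concurrent tasks: default BLAS threading
    # (assumed here to be 8 threads per call) versus single-threaded BLAS.
    print(worst_case_threads(8, 8), "vs", worst_case_threads(8, 1))
```

These entries only affect executor processes; in local mode, or for BLAS calls made on the driver, the same variables would presumably need to be set in the environment that launches the JVM.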

I think we should first add some notes about this to docs/ml-guide.md.

Issue Links

is related to

SPARK-21688 performance improvement in mllib SVM with native BLAS

Resolved

links to

[Github] Pull Request #18551 (mpjlu)

People

Assignee: Peng Meng

Reporter: Peng Meng

Votes: 0

Watchers: 8

Dates

Created: 04/Jul/17 14:53

Updated: 10/Aug/17 08:03

Resolved: 12/Jul/17 10:02

Time Tracking

Estimated: 504h
Remaining: 504h
Logged: Not Specified