Even though we may not be able to provide high-performance BLAS libraries ourselves (due to licenses, system dependencies, etc.), we could better document how to link against such libraries on various systems. This could be a new section in the MLlib programming guide.
Tuned libraries can be much faster than the default ones. See the discussion on this dev@spark thread.
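One thing such a guide section could include is a quick way to check which BLAS implementation a JVM actually picked up. The sketch below assumes MLlib reaches BLAS through the netlib-java wrapper (`com.github.fommil.netlib.BLAS`); that is an assumption about the build in use, and the `BlasCheck` class and `describeBlas` method are hypothetical names made up for illustration. It uses reflection so that it still runs when netlib-java is not on the classpath.

```java
// Hypothetical helper: reports which BLAS backend netlib-java selected.
// Assumes the netlib-java wrapper (com.github.fommil.netlib.BLAS); other
// wrappers would need a different class name.
final class BlasCheck {

    static final String NOT_FOUND = "netlib-java not on classpath";

    // Returns the class name of the loaded BLAS implementation, or
    // NOT_FOUND if the wrapper cannot be loaded or invoked reflectively.
    static String describeBlas() {
        try {
            Class<?> blas = Class.forName("com.github.fommil.netlib.BLAS");
            // BLAS.getInstance() is static, so the receiver is null.
            Object instance = blas.getMethod("getInstance").invoke(null);
            return instance.getClass().getName();
        } catch (ReflectiveOperationException e) {
            return NOT_FOUND;
        }
    }

    public static void main(String[] args) {
        System.out.println("BLAS implementation: " + describeBlas());
    }
}
```

If the reported class name ends in `F2jBLAS`, the pure-Java fallback is in use and no native library was linked; a name such as `NativeSystemBLAS` would suggest that a system BLAS was found.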