SPARK-11968 relates to excessive GC pressure from using the "blocked BLAS 3" approach for generating top-k recommendations in mllib.recommendation.MatrixFactorizationModel.
The solution there is still based on blocking factors, but efficiently computes the top-k elements per block first (using BoundedPriorityQueue) and then computes the global top-k elements.
This substantially improves performance and reduces GC pressure for mllib's ALS model. The same approach is also much more efficient than the current "crossJoin and score per-row" approach used in ml's DataFrame-based method. This change adapts the solution in SPARK-11968 for DataFrame.
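
For illustration, the per-block top-k followed by a global top-k described above can be sketched in plain Python. This is not the Spark implementation: the helper names (blocks, dot, top_k_for_user) and the block_size parameter are hypothetical, and heapq.nlargest stands in for BoundedPriorityQueue.

```python
import heapq
from itertools import islice


def blocks(items, block_size):
    """Yield successive lists of at most block_size items (hypothetical helper)."""
    it = iter(items)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def dot(u, v):
    """Dot product of two equal-length factor vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_for_user(user_vec, item_factors, k, block_size):
    """Return the k best (item_id, score) pairs for one user.

    item_factors is an iterable of (item_id, factor_vector) pairs.
    Each block keeps at most k scored candidates, so the number of
    retained candidates is bounded by k times the number of blocks
    rather than by the total number of items.
    """
    candidates = []
    for block in blocks(item_factors, block_size):
        scored = ((dot(user_vec, vec), item_id) for item_id, vec in block)
        # Per-block top-k: heapq.nlargest plays the role of a bounded queue.
        candidates.extend(heapq.nlargest(k, scored))
    # Global top-k over the per-block survivors.
    best = heapq.nlargest(k, candidates)
    return [(item_id, score) for score, item_id in best]
```

The point of the sketch is that every block contributes at most k candidates to the final merge, so the full set of user-item scores is never held at once, which is where the reduced GC pressure would be expected to come from.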