Description
Separate out linear algebra as a standalone module without a Spark dependency to simplify production deployment. We can call the new module mllib-local; it might also contain local models in the future.
The main obstacle is removing the linear algebra classes' dependency on user-defined types (UDTs).
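For illustration only, a production build that needs just the linear algebra classes might declare the standalone module roughly as below. The artifact coordinates and version are assumptions based on the proposed module name, not something this issue specifies.

```scala
// build.sbt fragment (sketch). Assumes the module is published as
// "spark-mllib-local"; the version shown is an assumption.
libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % "2.0.0"
```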
The package name will change from mllib to ml. For example, Vector will move from `org.apache.spark.mllib.linalg.Vector` to `org.apache.spark.ml.linalg.Vector`. The new ML pipeline API will return the vector type from the ml package; the existing mllib code will not be touched. As a result, this change will potentially break the API. Also, when an mllib vector is loaded by Spark SQL, it will be automatically converted into the corresponding vector in the ml package.
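As a rough sketch of what the package change means for calling code, the snippet below converts explicitly between the two vector types. It assumes Spark 2.0 or later with spark-mllib on the classpath, where the mllib classes provide `asML` and `Vectors.fromML`; the object `VectorMigrationSketch` and its method names are hypothetical helpers introduced here for illustration.

```scala
// Illustrative sketch; assumes Spark 2.0+ with spark-mllib on the classpath.
import org.apache.spark.mllib.linalg.{Vector => OldVector, Vectors => OldVectors}
import org.apache.spark.ml.linalg.{Vector => NewVector}

// Hypothetical helper object, not part of Spark.
object VectorMigrationSketch {
  // Convert an mllib vector to its ml counterpart.
  def toNew(v: OldVector): NewVector = v.asML

  // Convert an ml vector back to the mllib type.
  def toOld(v: NewVector): OldVector = OldVectors.fromML(v)

  def main(args: Array[String]): Unit = {
    val oldVec: OldVector = OldVectors.dense(1.0, 0.0, 3.0)
    val newVec: NewVector = toNew(oldVec)
    val roundTrip: OldVector = toOld(newVec)

    // Show the concrete class of the converted vector and check the round trip.
    println(newVec.getClass.getName)
    println(roundTrip == oldVec)
  }
}
```

Code that mixes the old mllib API with the new pipeline API would need this kind of explicit conversion at the boundary, since the two `Vector` types are distinct classes despite sharing a name.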
Issue Links
- blocks: SPARK-14707 Linear algebra: clarify light vs heavy constructors and accessors (Resolved)
- relates to: SPARK-14739 Vectors.parse doesn't handle dense vectors of size 0 and sparse vectors with no indices (Resolved)