Details
Type: New Feature
Status: Closed
Priority: Minor
Resolution: Won't Do
Description
Dimensionality reduction is a crucial prerequisite for many data analysis tasks. Therefore, Flink's machine learning library should contain a principal component analysis (PCA) implementation. Maria-Florina Balcan et al. [1] propose a distributed PCA algorithm. A more recent publication [2] describes another scalable PCA implementation.
Resources:
[1] http://arxiv.org/pdf/1408.5823v5.pdf
[2] http://ds.qcri.org/images/profile/tarek_elgamal/sigmod2015.pdf
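
To make the request concrete, here is a minimal sketch of one simple way PCA can be distributed. It is not the algorithm from [1] or [2], and it is not Flink API code. It assumes each partition can be summarized by its row count, column sums, and sum of outer products. These summaries merge by plain addition, which maps onto a reduce step. The leading principal component is then found by power iteration on the merged covariance matrix. All function names are hypothetical, and the approach assumes the d x d covariance matrix fits in memory on one node.

```python
from functools import reduce


def partial_stats(rows):
    """Per-partition summary: row count, column sums, sum of outer products."""
    d = len(rows[0])
    sums = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for x in rows:
        for i in range(d):
            sums[i] += x[i]
            for j in range(d):
                outer[i][j] += x[i] * x[j]
    return len(rows), sums, outer


def merge_stats(a, b):
    """Combine two summaries; every part is a plain sum, so merging is addition."""
    n = a[0] + b[0]
    sums = [p + q for p, q in zip(a[1], b[1])]
    outer = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[2], b[2])]
    return n, sums, outer


def covariance(stats):
    """Sample covariance matrix from the merged summary (needs n >= 2)."""
    n, sums, outer = stats
    d = len(sums)
    mean = [s / n for s in sums]
    return [[(outer[i][j] - n * mean[i] * mean[j]) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=200):
    """Power iteration for the leading eigenvector of a symmetric PSD matrix.

    Assumption: the all-ones start vector is not orthogonal to that eigenvector.
    """
    d = len(cov)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def first_principal_component(partitions):
    """Map each partition to a summary, reduce by merging, then decompose."""
    stats = reduce(merge_stats, map(partial_stats, partitions))
    return top_component(covariance(stats))
```

For example, calling `first_principal_component([[[1.0, 2.0], [2.0, 4.0]], [[3.0, 6.0], [4.0, 8.0]]])` on two partitions whose points all lie on the line y = 2x returns a unit vector along the direction (1, 2). The approaches in [1] and [2] target settings where this kind of exact covariance accumulation becomes too expensive.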