We would like to add a parallel implementation of the Stacked Autoencoder (SAE) deep learning algorithm to Spark MLlib.
SAE is one of the most popular deep learning algorithms. It has produced strong results on benchmarks such as MNIST handwritten digit classification and in Google's ICML 2012 "cat face" paper (http://icml.cc/2012/papers/73.pdf).
Our focus is to leverage RDDs to provide an SAE with the following capabilities, while keeping it easy to use for both beginners and advanced researchers:
1. Multi-layer SAE deep network training and scoring.
2. Unsupervised feature learning.
3. Supervised learning with multinomial logistic regression (softmax).
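
To make the first two capabilities concrete, here is a minimal single-machine sketch of greedy layer-wise SAE training in plain Python. It is only an illustration, not the proposed MLlib implementation: the function names (train_autoencoder, encode, train_stack, extract_features), the sigmoid activation, the squared-error loss, and the hyperparameters are all assumptions made for this example. A Spark version would presumably compute the gradients over RDD partitions and aggregate them on the driver. The softmax classifier from item 3 is left out.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_autoencoder(data, n_hidden, epochs=200, lr=0.5, seed=0):
    """Train one autoencoder layer with per-sample SGD on squared
    reconstruction error. Returns only the encoder parameters (W1, b1)."""
    rng = random.Random(seed)
    n_in = len(data[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b2 = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            # Forward pass: input -> hidden code -> reconstruction.
            h = [sigmoid(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            y = [sigmoid(sum(w * hj for w, hj in zip(W2[i], h)) + b2[i])
                 for i in range(n_in)]
            # Backward pass: deltas are computed before any weight changes.
            d_out = [(y[i] - x[i]) * y[i] * (1.0 - y[i]) for i in range(n_in)]
            d_hid = [sum(d_out[i] * W2[i][j] for i in range(n_in)) * h[j] * (1.0 - h[j])
                     for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W2[i][j] -= lr * d_out[i] * h[j]
                b2[i] -= lr * d_out[i]
            for j in range(n_hidden):
                for i in range(n_in):
                    W1[j][i] -= lr * d_hid[j] * x[i]
                b1[j] -= lr * d_hid[j]
    return W1, b1


def encode(layer, x):
    """Map an input vector to the hidden code of one trained layer."""
    W, b = layer
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def train_stack(data, hidden_sizes):
    """Greedy layer-wise training: each layer is trained on the codes
    produced by the layers below it."""
    layers = []
    current = data
    for k, n_hidden in enumerate(hidden_sizes):
        layer = train_autoencoder(current, n_hidden, seed=k)
        layers.append(layer)
        current = [encode(layer, x) for x in current]
    return layers


def extract_features(layers, x):
    """Unsupervised feature extraction: push x through every encoder."""
    for layer in layers:
        x = encode(layer, x)
    return x


# Toy usage: four 4-dimensional inputs, a 4 -> 3 -> 2 stack.
toy_data = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
stack = train_stack(toy_data, [3, 2])
features = extract_features(stack, toy_data[0])
```

Here the features from the top layer could then be fed to a softmax classifier for the supervised step, with optional fine-tuning of the whole stack.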