Details

Improvement

Status: Resolved

Major

Resolution: Fixed

0.6.3
Description
In principle, linear regression, logistic regression, MLP, autoencoders, and deep nets can all be represented by a generic neural network model. Introducing a generic model and having the concrete models derive from it would increase code reuse.
More concretely:
Linear regression is a two-layer neural network (one input layer and one output layer) with the identity function f(x) = x as the squashing function and squared error as the cost function.
Logistic regression is similar to linear regression, except that the squashing function is the sigmoid and the cost function is cross entropy.
MLP is a neural network with at least 2 layers of neurons. The squashing function can be sigmoid or tanh (possibly more), and the cost function can be cross entropy or squared error (possibly more).
A (sparse) autoencoder can be used for nonlinear dimensionality reduction and anomaly detection. It can also be used as the building block of deep nets.
Generally it is a three-layer neural network in which the input layer has the same size as the output layer, and the hidden layer is typically smaller than the input/output layer. Its cost function is squared error + KL divergence.
Deep nets are used for deep learning; a simple architecture is to stack several autoencoders together.
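The first two cases above can be sketched as a single generic model that differs only in its squashing function and cost function. This is a minimal illustration, not the proposed Hama API: the class `TwoLayerModel`, the `Cost` interface, and all method names are hypothetical, and the squared error uses a conventional 1/2 factor that the description does not specify.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch: one weight layer with a pluggable squashing function and cost. */
final class TwoLayerModel {

    /** Cost of a single output against its target (hypothetical interface). */
    interface Cost {
        double apply(double target, double output);
    }

    // Squashing functions named in the description.
    static final DoubleUnaryOperator IDENTITY = x -> x;
    static final DoubleUnaryOperator SIGMOID = x -> 1.0 / (1.0 + Math.exp(-x));

    // Cost functions named in the description (1/2 factor is an assumption).
    static final Cost SQUARED_ERROR = (t, y) -> 0.5 * (t - y) * (t - y);
    static final Cost CROSS_ENTROPY =
        (t, y) -> -(t * Math.log(y) + (1.0 - t) * Math.log(1.0 - y));

    private final double[] weights; // weights[0] is the bias term
    private final DoubleUnaryOperator squashing;
    private final Cost costFunction;

    TwoLayerModel(double[] weights, DoubleUnaryOperator squashing, Cost costFunction) {
        this.weights = weights.clone();
        this.squashing = squashing;
        this.costFunction = costFunction;
    }

    /** Weighted sum of the inputs plus bias, passed through the squashing function. */
    double output(double[] features) {
        double sum = weights[0];
        for (int i = 0; i < features.length; i++) {
            sum += weights[i + 1] * features[i];
        }
        return squashing.applyAsDouble(sum);
    }

    /** Cost of the model's output for one labeled example. */
    double cost(double[] features, double target) {
        return costFunction.apply(target, output(features));
    }

    /** Linear regression: identity squashing, squared-error cost. */
    static TwoLayerModel linearRegression(double[] weights) {
        return new TwoLayerModel(weights, IDENTITY, SQUARED_ERROR);
    }

    /** Logistic regression: sigmoid squashing, cross-entropy cost. */
    static TwoLayerModel logisticRegression(double[] weights) {
        return new TwoLayerModel(weights, SIGMOID, CROSS_ENTROPY);
    }
}
```

Only the two factory methods distinguish the models; the forward computation is shared, which is the reuse the generic model is meant to provide.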
The steps:
1. Create the package 'org.apache.hama.ml.ann'. This package holds the abstract models and an abstract trainer that defines the interface for the concrete model trainers. The concrete training implementation should be decoupled from the model, so that new training methods can easily be added in the future.
2. Move PerceptronTrainer from 'org.apache.hama.ml.perceptron' to 'org.apache.hama.ml.ann' and rename it to NeuralNetworkTrainer. The API defined in this class is generic enough to be reused by all the neural network trainers.
3. Add abstract NeuralNetwork, AbstractLayeredNeuralNetwork, and SmallLayeredNeuralNetwork to the above package.
a. NeuralNetwork defines the common behaviors of all neural network based models.
b. AbstractLayeredNeuralNetwork defines the common behaviors of all layered neural network based models.
c. SmallLayeredNeuralNetwork defines the common behaviors of all layered neural networks whose topology can be loaded into one machine.
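The hierarchy in step 3 might look roughly like the sketch below. Only the three class names come from this issue; every field and method (`setLearningRate`, `addLayer`, `getLayerCount`, `getOutput`), the zero-initialized weights, and the fixed sigmoid squashing are assumptions made for illustration. Treating SmallLayeredNeuralNetwork as a concrete class is also an assumption. The trainer from step 2 is omitted.

```java
import java.util.ArrayList;
import java.util.List;

/** Behavior common to all neural network based models (methods are hypothetical). */
abstract class NeuralNetwork {
    protected double learningRate = 0.1;

    void setLearningRate(double learningRate) {
        this.learningRate = learningRate;
    }

    /** Computes the model output for one input instance. */
    abstract double[] getOutput(double[] input);
}

/** Behavior common to all layered models: an ordered list of layer sizes. */
abstract class AbstractLayeredNeuralNetwork extends NeuralNetwork {
    protected final List<Integer> layerSizes = new ArrayList<>();

    /** Appends a layer and returns its index. */
    int addLayer(int size) {
        layerSizes.add(size);
        return layerSizes.size() - 1;
    }

    int getLayerCount() {
        return layerSizes.size();
    }
}

/** Layered model whose whole topology (all weight matrices) fits in local memory. */
class SmallLayeredNeuralNetwork extends AbstractLayeredNeuralNetwork {
    // weights.get(k)[j][i]: weight from neuron i of layer k to neuron j of layer k + 1;
    // column 0 is the bias. Zero initialization is an assumption of this sketch.
    private final List<double[][]> weights = new ArrayList<>();

    @Override
    int addLayer(int size) {
        int index = super.addLayer(size);
        if (index > 0) {
            int previousSize = layerSizes.get(index - 1);
            weights.add(new double[size][previousSize + 1]);
        }
        return index;
    }

    @Override
    double[] getOutput(double[] input) {
        double[] current = input;
        for (double[][] layer : weights) {
            double[] next = new double[layer.length];
            for (int j = 0; j < layer.length; j++) {
                double sum = layer[j][0];
                for (int i = 0; i < current.length; i++) {
                    sum += layer[j][i + 1] * current[i];
                }
                next[j] = 1.0 / (1.0 + Math.exp(-sum)); // sigmoid squashing
            }
            current = next;
        }
        return current;
    }
}
```

Keeping the weight storage in the lowest class leaves room for a sibling of SmallLayeredNeuralNetwork that distributes the topology across machines while reusing the layer bookkeeping above it.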