Details

Type: New Feature

Status: Closed

Priority: Minor

Resolution: Duplicate

Affects Version/s: 0.7

Fix Version/s: 0.9

Component/s: None

Labels:
Description
Implement a multilayer perceptron
* via matrix multiplication
* learning by backpropagation; implementing tricks by Yann LeCun et al.: "Efficient BackProp"
* arbitrary number of hidden layers (also 0, i.e. just the linear model)
* connections between adjacent layers only
* different cost and activation functions (a different activation function in each layer)
* test of backprop by gradient checking (see the sketch after this list)
* normalization of the inputs (storable) as part of the model
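To make the gradient-checking item concrete, the following is a minimal sketch, assuming a one-hidden-layer network with sigmoid hidden units, a linear output, and squared-error cost; it compares the backprop gradient against a central finite difference. All class and variable names are illustrative and not taken from the attached patch.

{code:java}
// Gradient check: backprop vs. central finite differences for a
// 1-hidden-layer MLP (sigmoid hidden units, linear output, squared error).
public class GradientCheck {

  static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

  /** Squared-error cost 0.5 * (out - y)^2 of the net on one example. */
  static double cost(double[][] w1, double[] w2, double[] x, double y) {
    double out = 0.0;
    for (int j = 0; j < w1.length; j++) {
      double z = 0.0;
      for (int i = 0; i < x.length; i++) z += w1[j][i] * x[i];
      out += w2[j] * sigmoid(z);
    }
    double d = out - y;
    return 0.5 * d * d;
  }

  public static void main(String[] args) {
    double[][] w1 = {{0.1, -0.2}, {0.4, 0.3}};  // input -> hidden weights
    double[] w2 = {0.5, -0.7};                  // hidden -> output weights
    double[] x = {1.0, 2.0};                    // one training example
    double y = 1.0;                             // its target

    // Forward pass, keeping the hidden activations for backprop.
    double[] h = new double[w1.length];
    double out = 0.0;
    for (int j = 0; j < w1.length; j++) {
      double z = 0.0;
      for (int i = 0; i < x.length; i++) z += w1[j][i] * x[i];
      h[j] = sigmoid(z);
      out += w2[j] * h[j];
    }
    double delta = out - y;  // dCost/dOut

    double eps = 1e-5;
    for (int j = 0; j < w1.length; j++) {
      for (int i = 0; i < x.length; i++) {
        // Backprop: dCost/dw1[j][i] = delta * w2[j] * h[j]*(1 - h[j]) * x[i]
        double analytic = delta * w2[j] * h[j] * (1.0 - h[j]) * x[i];
        // Central finite difference on the same weight.
        double saved = w1[j][i];
        w1[j][i] = saved + eps;
        double cPlus = cost(w1, w2, x, y);
        w1[j][i] = saved - eps;
        double cMinus = cost(w1, w2, x, y);
        w1[j][i] = saved;
        double numeric = (cPlus - cMinus) / (2.0 * eps);
        System.out.printf("dw1[%d][%d]: backprop=%.8f numeric=%.8f%n",
            j, i, analytic, numeric);
      }
    }
  }
}
{code}

If backprop is implemented correctly, the two values should agree up to roughly the square of the step size, i.e. differences on the order of 1e-10 here.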
First:
* implementation of "stochastic gradient descent", like the gradient machine
* simple gradient descent incl. momentum (see the sketch after this list)
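A minimal sketch of the "simple gradient descent incl. momentum" step, assuming the usual velocity formulation v = momentum * v - learningRate * gradient; the class and its parameters are illustrative, not part of the patch.

{code:java}
// Plain gradient descent with momentum; illustrative sketch only.
public class MomentumUpdate {
  private final double learningRate;
  private final double momentum;    // typically around 0.9
  private final double[] velocity;  // one entry per weight, starts at 0

  public MomentumUpdate(int numWeights, double learningRate, double momentum) {
    this.learningRate = learningRate;
    this.momentum = momentum;
    this.velocity = new double[numWeights];
  }

  /** In-place update: v = momentum * v - lr * grad; w += v. */
  public void update(double[] weights, double[] gradient) {
    for (int i = 0; i < weights.length; i++) {
      velocity[i] = momentum * velocity[i] - learningRate * gradient[i];
      weights[i] += velocity[i];
    }
  }
}
{code}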
Later (new JIRA issues):
* distributed batch learning (see below)
* "Stacked (Denoising) Autoencoder" for feature learning
* advanced cost minimization like 2nd-order methods, conjugate gradient, etc.
Distribution of learning can be done by (batch learning):
1. Partitioning the data into x chunks
2. Learning the weight changes as matrices in each chunk
3. Combining the matrices and updating the weights; then back to 2
Maybe this procedure can be done with random parts of the chunks (distributed quasi-online learning); a sequential sketch of the loop follows.
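A sequential sketch of steps 1-3, assuming the per-chunk weight changes are plain gradient matrices that are combined by averaging; computing them is the part that could be distributed (e.g. one map task per chunk). The ChunkGradient interface is a stand-in for backprop over one chunk, not an API from the patch.

{code:java}
import java.util.List;

// Chunked batch learning: compute a weight-change matrix per chunk
// (independently, hence distributable), average, update, repeat.
public class ChunkedBatchLearning {

  interface ChunkGradient {
    /** Returns dCost/dW summed over one chunk; same shape as weights. */
    double[][] compute(double[][] weights, List<double[]> chunk);
  }

  static void train(double[][] weights, List<List<double[]>> chunks,
                    ChunkGradient grad, double learningRate, int epochs) {
    int rows = weights.length, cols = weights[0].length;
    for (int epoch = 0; epoch < epochs; epoch++) {
      double[][] combined = new double[rows][cols];
      // Step 2: per-chunk weight changes (independent of each other).
      for (List<double[]> chunk : chunks) {
        double[][] g = grad.compute(weights, chunk);
        for (int r = 0; r < rows; r++)
          for (int c = 0; c < cols; c++)
            combined[r][c] += g[r][c];
      }
      // Step 3: combine (here: average) and update, then back to step 2.
      for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
          weights[r][c] -= learningRate * combined[r][c] / chunks.size();
    }
  }
}
{code}

For the quasi-online variant, the inner loop would run over a random subset of the chunks in each epoch instead of all of them.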
Batch learning with delta-bar-delta heuristics for adapting the learning rates (sketched below).
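A per-weight sketch of the delta-bar-delta rule (Jacobs, 1988): a weight's learning rate grows additively while its gradient keeps the same sign as an exponential average of past gradients, and shrinks multiplicatively when the sign flips. The constants are typical choices, not values from this issue.

{code:java}
// Delta-bar-delta: one adaptive learning rate per weight.
public class DeltaBarDelta {
  private final double kappa = 0.01;  // additive rate increase
  private final double phi = 0.5;     // multiplicative rate decrease
  private final double theta = 0.7;   // decay of the gradient average
  private final double[] rates;       // per-weight learning rates
  private final double[] bar;         // exponential average of past gradients

  public DeltaBarDelta(int numWeights, double initialRate) {
    rates = new double[numWeights];
    bar = new double[numWeights];
    java.util.Arrays.fill(rates, initialRate);
  }

  /** One batch update of the weights from the batch gradient. */
  public void update(double[] weights, double[] gradient) {
    for (int i = 0; i < weights.length; i++) {
      double agreement = bar[i] * gradient[i];
      if (agreement > 0) {
        rates[i] += kappa;   // consistent sign: speed up
      } else if (agreement < 0) {
        rates[i] *= phi;     // sign flip: slow down
      }
      bar[i] = (1.0 - theta) * gradient[i] + theta * bar[i];
      weights[i] -= rates[i] * gradient[i];
    }
  }
}
{code}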
Issue Links
is superseded by
MAHOUT-1265 Add Multilayer Perceptron
 Closed
Activity
Christian Herta
created issue 
Christian Herta
made changes 
Field  Original Value  New Value 

Original Estimate  336h [ 1209600 ]  80h [ 288000 ] 
Remaining Estimate  336h [ 1209600 ]  80h [ 288000 ] 
Christian Herta
made changes 
Description  "test of backprop by numerically gradient checking"  "test of backprop by gradient checking" 
Christian Herta
made changes 
Description  "Maybe is procedure can be done"  "Maybe this procedure can be done" 
Christian Herta
made changes 
Description  "Distribution of learning can be done in batch learning by:"  "Distribution of learning can be done by (batch learning):" 
Christian Herta
made changes 
Description  added "simple gradient descent" under First; added "momentum for better and faster learning" and "advanced cost minimazation like 2nd order methods, conjugate gradient etc." under Later 
Christian Herta
made changes 
Description  "simple gradient descent"  "simple gradient descent incl. momentum"; moved the momentum item out of Later; added "Batch learning with deltabardelta heuristics for adapting the learning rates." 
Christian Herta
made changes 
Status  Open [ 1 ]  Patch Available [ 10002 ] 
Christian Herta
made changes 
Attachment  MAHOUT-976.patch [ 12514275 ] 
Christian Herta
made changes 
Comment 
[ incomplete and completely untested;
should only compile ] 
Christian Herta
made changes 
Description  added "normalization of the inputs (storeable) as part of the model" 
Christian Herta
made changes 
Attachment  MAHOUT-976.patch [ 12514809 ] 
Christian Herta
made changes 
Attachment  MAHOUT-976.patch [ 12515722 ] 
Christian Herta
made changes 
Attachment  MAHOUT-976.patch [ 12516203 ] 
Robin Anil
made changes 
Assignee  Ted Dunning [ tdunning ] 
Robin Anil
made changes 
Fix Version/s  0.8 [ 12320153 ] 
Robin Anil
made changes 
Fix Version/s  Backlog [ 12318886 ]  
Fix Version/s  0.8 [ 12320153 ] 
Suneel Marthi
made changes 
Link 
This issue is superseded by MAHOUT-1265 
Suneel Marthi
made changes 
Status  Patch Available [ 10002 ]  Resolved [ 5 ] 
Fix Version/s  0.9 [ 12324577 ]  
Resolution  Duplicate [ 3 ] 
Suneel Marthi
made changes 
Fix Version/s  Backlog [ 12318886 ] 
Suneel Marthi
made changes 
Status  Resolved [ 5 ]  Closed [ 6 ] 
Assignee  Ted Dunning [ tdunning ]  Suneel Marthi [ smarthi ] 
Transition  Time In Source Status  Execution Times  Last Executer  Last Execution Date 
Open -> Patch Available  4d 19h 50m  1  Christian Herta  12/Feb/12 16:30  
Patch Available -> Resolved  677d 4h 47m  1  Suneel Marthi  20/Dec/13 21:17  
Resolved -> Closed  44d 10h 39m  1  Suneel Marthi  03/Feb/14 07:57 
Although it's not the same (but again a neural network) and AFAIK the learning is sequential, it's worth checking out the restricted Boltzmann machine implementation that has just been submitted in
MAHOUT-968