Mahout / MAHOUT-968

Classifier based on restricted Boltzmann machines

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.7
    • Fix Version/s: None
    • Component/s: Classification

      Description

      This is a proposal for a new classifier based on restricted Boltzmann machines. The development of this feature follows the 2009 paper "Deep Boltzmann Machines" (DBM) [1]. The proposed model achieved an error rate of 0.95% on the MNIST dataset [2], which is really good. Main parts of the implementation should also be applicable to uses of restricted Boltzmann machines other than classification (cf. MAHOUT-375).
      I am working on this feature right now, and the results are promising. The only problem with the training algorithm is that it is still mostly sequential (training batches should be small), which has so far made Map/Reduce not really beneficial. However, since the algorithm itself is fast (for a training algorithm), training can be done on a single machine in manageable time.
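      (To sketch why: with minibatch gradient descent, each update

      $$\theta_{t+1} = \theta_t - \eta\,\nabla L_{B_t}(\theta_t)$$

      depends on the parameters produced by the previous update, so only the work within a single batch $B_t$ can be parallelized, and small batches leave little to distribute across a cluster.)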
      Testing of the algorithm is currently being done on the MNIST dataset itself, to reproduce the results of [1]. As soon as the results indicate that everything is working fine, I will upload the patch.

      [1] http://www.cs.toronto.edu/~hinton/absps/dbm.pdf
      [2] http://yann.lecun.com/exdb/mnist/

      Attachments

      1. MAHOUT-968.patch
        137 kB
        Dirk Weißenborn
      2. MAHOUT-968.patch
        131 kB
        Dirk Weißenborn

        Activity

        Dirk Weißenborn created issue -
        Viktor Gal added a comment -

        Any patch available for testing? I'd gladly give it a go. Thanks!

        Dirk Weißenborn added a comment - edited

        Here comes the first patch! Hope it works!

        Training:
        org.apache.mahout.classifier.rbm.training.RBMClassifierTrainingJob

        It can be run with MapReduce or locally, e.g.:
        java org.apache.mahout.classifier.rbm.training.RBMClassifierTrainingJob --input dirOrFile --output pathWhereModelShouldBeWritten --labelcount 10 --epochs 30 --monitor

        Training consists of 3 steps: bias initialization, greedy pretraining, and finetuning. However, it is possible to run only some of them at a time (options: --nobiases, --nogreedy, --nofinetuning).

        The input has to be a SequenceFile of <IntWritable, VectorWritable>, where the integer is the label; a sketch of producing such a file follows below.
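
        For reference, a minimal sketch (not part of the patch; the class name, path, and label are invented for illustration) of writing such an input file with the standard Hadoop and Mahout classes:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.SequenceFile;
        import org.apache.mahout.math.DenseVector;
        import org.apache.mahout.math.VectorWritable;

        public class WriteRbmInput {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path out = new Path("rbm-input/chunk-0");   // invented path
            // key = label, value = feature vector
            SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, out, IntWritable.class, VectorWritable.class);
            try {
              double[] pixels = new double[784];        // one 28x28 image, all zeros here
              writer.append(new IntWritable(3),         // the label
                  new VectorWritable(new DenseVector(pixels)));
            } finally {
              writer.close();
            }
          }
        }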

        Testing:
        org.apache.mahout.classifier.rbm.test.TestRBMClassifierJob should make things clearer.

        Preparation of the MNIST dataset (in examples):
        org.apache.mahout.classifier.rbm.MnistPreparer
        "size" is the number of examples that get processed into "chunknumber" minibatches (or chunks); "labelpath" and "imagepath" refer to the label and image files of the MNIST training/test data (see the example invocation below).

        I am running my own tests on the MNIST dataset right now, and they are nearly done. They are taking some time because of the dataset's size, but it is manageable. I can upload the trained model if someone wants it for testing.

        Dirk Weißenborn made changes -
        Field Original Value New Value
        Status Open [ 1 ] Patch Available [ 10002 ]
        Affects Version/s 0.7 [ 12319261 ]
        Fix Version/s 0.7 [ 12319261 ]
        Dirk Weißenborn made changes -
        Comment [ This is the initial patch ]
        Dirk Weißenborn made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Dirk Weißenborn added a comment -

        Initial patch.

        Dirk Weißenborn made changes -
        Attachment MAHOUT-968.patch [ 12513662 ]
        Dirk Weißenborn added a comment -

        Initial patch submitted.

        Dirk Weißenborn made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Viktor Gal added a comment -

        Great! I'll give it a go on my dataset and will let you know how it goes!

        Viktor Gal added a comment -

        Dirk, is this patch against the HEAD of trunk?
        I've tried to apply it, but the patch against:

        • pom.xml and
        • src/main/java/org/apache/mahout/cf/taste/impl/model/jdbc/ConnectionPoolDataSource.java

        failed.

        Dirk Weißenborn added a comment - edited

        New patch version, hope that solves the problem! I had some issues with slf4j, and my build fails when I try to install a fresh checkout, with this error:
        ConnectionPoolDataSource.java:[39,13] error: ConnectionPoolDataSource is not abstract and does not override abstract method getParentLogger() in CommonDataSource...
        The new patch doesn't change anything in the pom or in ConnectionPoolDataSource...
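
        For context: that compile error typically shows up when building older code under JDK 7, which added getParentLogger() to javax.sql.CommonDataSource. Assuming that is the cause here, the usual workaround is to implement the method in the affected class (illustrative sketch, not part of the patch):

        import java.sql.SQLFeatureNotSupportedException;
        import java.util.logging.Logger;

        // added inside ConnectionPoolDataSource to satisfy the JDK 7 interface
        public Logger getParentLogger() throws SQLFeatureNotSupportedException {
          throw new SQLFeatureNotSupportedException();
        }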

        Dirk Weißenborn made changes -
        Attachment MAHOUT-968.patch [ 12513721 ]
        Viktor Gal added a comment -

        The patch works great; the problem is I still haven't got around to getting my dataset into shape so I can start feeding it to the RBM classifier. I hope I can still do it today, but in the worst case I'll get back to you within 12 hours.

        Dirk Weißenborn added a comment -

        You need to feed the label and image files to the MnistPreparer in Mahout-Examples; size 44000 should work, with chunknumber 440. Then try training the network with the directory where the chunks were written. Run the training greedily first, so use "-nf" (no finetuning) as an option. The structure should be 784,500,1000 (these are the layers of the network). If you want to run it with Hadoop, use -mapreduce; everything else is OK with the defaults. After training greedily, the model will be saved. You can take this model and finetune it later (set the output path to the model; it will be materialized), then using "-ng" for no greedy training. This first part of the training took about 26 hours on my 8-core machine! I have a trained model if you would like to have it. (A sketch of the whole sequence follows below.)
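
        Put together as a command sequence (a sketch based on the options named above; the --structure flag spelling and the paths are assumptions, not confirmed here):

        # 1) greedy pretraining only (skip finetuning), layers 784,500,1000
        java org.apache.mahout.classifier.rbm.training.RBMClassifierTrainingJob --input chunks --output model --labelcount 10 --structure 784,500,1000 -nf --monitor

        # 2) finetuning only, resuming from the saved model (skip the greedy pass)
        java org.apache.mahout.classifier.rbm.training.RBMClassifierTrainingJob --input chunks --output model --labelcount 10 -ng --monitor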

        Viktor Gal added a comment -

        MnistPreparer is for images in the MNIST dataset, right? I'd like to test it on a very different dataset... but let's see how long it'll take, as the dataset is quite a bit bigger than MNIST.

        Dirk Weißenborn added a comment -

        That depends heavily on the number of training batches, the number of epochs you are using, and of course on the network structure. You could try one epoch through the dataset at first and then a little finetuning, maybe. If you have fewer input neurons it should go much faster... If you need any help with parameter tuning, I can probably help you. If you want to monitor progress, don't forget the --monitor option, which shows the reconstruction error (greedy pretraining) / discriminative error (finetuning) after each batch.
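
        (For orientation, general RBM background rather than anything specific to this patch: the reconstruction error reported during greedy pretraining is typically the squared distance between an input $v$ and its reconstruction $v'$ after one Gibbs step through the hidden layer,

        $$E_{\mathrm{rec}} = \lVert v - v' \rVert^2,$$

        while the discriminative error during finetuning measures how often the predicted label is wrong.)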

        Grant Ingersoll added a comment -

        Dirk,

        Thanks for updating this. Perhaps some docs on the wiki would help those of us who are not as familiar with it check it out?

        Dirk Weißenborn made changes -
        Comment [ So how is testing going? Any bugs or problems until now? ]
        Jeff Eastman made changes -
        Fix Version/s 0.8 [ 12320153 ]
        Fix Version/s 0.7 [ 12319261 ]
        Robin Anil added a comment -

        I can be a reviewer if you are willing to work on it.

        As I see it now, it requires:
        A) A lot of stylistic cleanup.
        B) A lot of code structuring cleanup (no typecasting, please).
        C) Tests.

        Robin Anil made changes -
        Assignee Robin Anil [ robinanil ]
        Fix Version/s Backlog [ 12318886 ]
        Fix Version/s 0.8 [ 12320153 ]
        Dirk Weißenborn added a comment -

        I would rather withdraw the patch, because at the time I implemented it I didn't know that the learning algorithm is not suited to MR, so I think there is no point in including it. Thank you for your comments, though!

        Dirk Weißenborn made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Sebastian Schelter made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Won't Fix [ 2 ]
        Sebastian Schelter made changes -
        Fix Version/s Backlog [ 12318886 ]

          People

          • Assignee:
            Robin Anil
            Reporter:
            Dirk Weißenborn
          • Votes:
            0
            Watchers:
            8


              Time Tracking

              Estimated: 336h
              Remaining: 336h
              Logged: Not Specified
