MAHOUT-918

Implement SGD based classifiers using MapReduce

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.6
    • Fix Version/s: None
    • Component/s: Classification
    • Labels:
      None

      Description

      Implement SGD-based classifiers (Logistic Regression, Adaptive Logistic Regression, and Passive-Aggressive) using MapReduce.
      They are implemented using the Iterative Parameter Mixtures algorithm, which is described in the following papers.

      http://research.google.com/pubs/pub36948.html
      http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
      http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf
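
A minimal single-process sketch of iterative parameter mixing, in the spirit of the papers above, might look like the following. It is not the code in the attached patch: the class `IpmSketch`, its method names, the fixed learning rate, and the toy data are all assumptions made for illustration. Each shard runs one SGD pass for logistic regression starting from the shared weights (the map side), the per-shard weight vectors are averaged uniformly (the reduce side), and the average seeds the next iteration.

```java
import java.util.Arrays;

public class IpmSketch {
    // Logistic function.
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // "Map" side: one SGD pass over a shard, starting from the shared weights.
    static double[] trainShard(double[] start, double[][] x, int[] y, double rate) {
        double[] w = Arrays.copyOf(start, start.length);
        for (int i = 0; i < x.length; i++) {
            double z = 0.0;
            for (int j = 0; j < w.length; j++) {
                z += w[j] * x[i][j];
            }
            double error = y[i] - sigmoid(z);
            for (int j = 0; j < w.length; j++) {
                w[j] += rate * error * x[i][j];
            }
        }
        return w;
    }

    // "Reduce" side: uniform average of the per-shard weight vectors.
    static double[] mix(double[][] models) {
        double[] avg = new double[models[0].length];
        for (double[] m : models) {
            for (int j = 0; j < avg.length; j++) {
                avg[j] += m[j] / models.length;
            }
        }
        return avg;
    }

    // Driver: repeat train-then-mix for a fixed number of iterations.
    static double[] train(double[][][] shardX, int[][] shardY,
                          int dims, int iterations, double rate) {
        double[] w = new double[dims];
        for (int it = 0; it < iterations; it++) {
            double[][] models = new double[shardX.length][];
            for (int s = 0; s < shardX.length; s++) {
                models[s] = trainShard(w, shardX[s], shardY[s], rate);
            }
            w = mix(models);
        }
        return w;
    }

    public static void main(String[] args) {
        // Toy data (hypothetical): feature 0 is a bias term,
        // and the label is 1 when feature 1 is positive.
        double[][][] shardX = {
            {{1, 2}, {1, -1}},
            {{1, 1.5}, {1, -2}}
        };
        int[][] shardY = {{1, 0}, {1, 0}};
        double[] w = train(shardX, shardY, 2, 50, 0.5);
        System.out.println(Arrays.toString(w));
    }
}
```

In a real MapReduce job each iteration would be a separate job whose output model is fed to the next one, which is where most of the per-iteration overhead comes from.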

      1. design.pdf
        57 kB
        issei yoshida
      2. MAHOUT-918.patch
        56 kB
        issei yoshida

        Activity

        issei yoshida added a comment -

        I wrote code that distributes Logistic Regression, Adaptive Logistic Regression, and Passive-Aggressive with MapReduce.
        I would appreciate your comments.

        Ted Dunning added a comment -

        Can you post this as a Review Board review? There are lots of comments to be made.

        At a high level, I note the following issues:

        1) I don't see a design document. You cite a few articles but you don't say what you are really doing.

        2) Is map-reduce an appropriate approach here for model averaging?

        3) How do you plan to deal with randomization of data order?

        4) There are a number of style issues:

        a) you have loops that look like this:

                  for (...) {
                     if (something) {
                        ... stuff ...
                        continue;
                     }
                     ... other stuff ...
                     break;
                  }
        

        This is slightly perverse and is akin to using goto statements. Much better is this:

        
                  for (...) {
                     if (something) {
                        ... stuff ...
                     } else {
                        ... other stuff ...
                        break;
                     }
                  }
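
To make the equivalence of the two loop shapes concrete, here is a small runnable sketch. The class `LoopStyle` and the task of skipping leading negative numbers are hypothetical, chosen only to give both shapes something to do. Both methods return the number of elements skipped before the first non-negative value.

```java
public class LoopStyle {
    // Original shape: continue in the if-branch, unconditional break at the end.
    static int withContinue(int[] values) {
        int skipped = 0;
        for (int v : values) {
            if (v < 0) {
                skipped++;
                continue;
            }
            System.out.println("first non-negative value: " + v);
            break;
        }
        return skipped;
    }

    // Suggested shape: the same control flow written as if/else.
    static int withElse(int[] values) {
        int skipped = 0;
        for (int v : values) {
            if (v < 0) {
                skipped++;
            } else {
                System.out.println("first non-negative value: " + v);
                break;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        int[] data = {-3, -1, 4, -2};
        System.out.println(withContinue(data));
        System.out.println(withElse(data));
    }
}
```

The if/else form keeps each exit from the loop body visible in the branch that causes it, which is the readability point being made.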
        
        Ted Dunning added a comment -

        Algorithmically, simply gluing several classifiers into a map-reduce framework doesn't really change much.

        Much more interesting would be to do something along these lines:

        http://arxiv.org/pdf/1107.2490

        or this:

        http://cacm.acm.org/blogs/blog-cacm/144075-hadoop-allreduce-and-terascale-learning/fulltext

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/
        -----------------------------------------------------------

        Review request for mahout.

        Summary
        -------

        MAHOUT-918 Parallelized SGD in MapReduce

        This addresses bug MAHOUT-918.
        https://issues.apache.org/jira/browse/MAHOUT-918

        Diffs


        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1211755

        Diff: https://reviews.apache.org/r/3072/diff

        Testing
        -------

        Thanks,

        issei

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/
        -----------------------------------------------------------

        (Updated 2011-12-08 06:52:01.921057)

        Review request for mahout.

        Summary
        -------

        MAHOUT-918 Parallelized SGD in MapReduce

        This addresses bug MAHOUT-918.
        https://issues.apache.org/jira/browse/MAHOUT-918

        Diffs (updated)


        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1211755
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION

        Diff: https://reviews.apache.org/r/3072/diff

        Testing
        -------

        Thanks,

        issei

        issei yoshida added a comment -

        I posted the code on Review Board and attached a design document.
        https://reviews.apache.org/r/3072/

        > Is map-reduce an appropriate approach here for model averaging?
        MPI or other frameworks may produce better results,
        but the important thing is that a MapReduce implementation is easy for Hadoop users to use.
        Some iterative algorithms already implemented in Mahout (K-means and other clustering algorithms) may not be ideally suited to MapReduce either, but that is not the point.

        The cited papers show that Iterative Parameter Mixture is the best way to distribute SGD in MapReduce.

        > How do you plan to deal with randomization of data order?
        It may be possible to randomize data order by customizing InputFormat.
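
One way to picture this without the Hadoop API is a reader that buffers the records of a split and hands them out in shuffled order. The sketch below is plain Java and only an assumption about how such a customized input path could behave; `ShuffledSplit` and its seed parameter are hypothetical, and it randomizes order only within one split, not across splits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffledSplit {
    // Returns the records of one split in a pseudo-random order.
    // The same seed and input give the same order, e.g. if a task is re-run.
    static <T> List<T> shuffled(List<T> records, long seed) {
        List<T> copy = new ArrayList<>(records);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<String> split = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        System.out.println(shuffled(split, 42L));
    }
}
```

Buffering a whole split in memory is a simplification; a large split would need a bounded buffer or a separate shuffling pass.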

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/#review3734
        -----------------------------------------------------------

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
        <https://reviews.apache.org/r/3072/#comment8405>

        Needs a comment about how this works.

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
        <https://reviews.apache.org/r/3072/#comment8406>

        This is nearly duplicated code. The mapper and reducer should share some code to avoid inconsistent defaults.

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
        <https://reviews.apache.org/r/3072/#comment8407>

        This really needs a comment. What is the purpose here?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
        <https://reviews.apache.org/r/3072/#comment8408>

        What is this intended to do? Why?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
        <https://reviews.apache.org/r/3072/#comment8403>

        Typo.

        Also, this doesn't say how this works or why it is the way it is.

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
        <https://reviews.apache.org/r/3072/#comment8404>

        Shouldn't there be a combiner as well?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
        <https://reviews.apache.org/r/3072/#comment8402>

        A comment here about what this weight is would be nice. Also, how can a double be a key? That is tantamount to comparing doubles, which is bad.
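
The concern about doubles as keys can be seen in a few lines of plain Java. This sketch is independent of the patch, and the class name `DoubleKeyDemo` is hypothetical. Two values that are mathematically equal but computed differently may differ in their last bits, so they land in different buckets of a map; a shuffle that groups records by key would likewise treat them as distinct keys.

```java
import java.util.HashMap;
import java.util.Map;

public class DoubleKeyDemo {
    // Counts how many distinct keys result from grouping the given values.
    static int distinctKeys(double... keys) {
        Map<Double, Integer> groups = new HashMap<>();
        for (double k : keys) {
            groups.merge(k, 1, Integer::sum);
        }
        return groups.size();
    }

    public static void main(String[] args) {
        double a = 0.1 + 0.2; // not exactly 0.3 in binary floating point
        double b = 0.3;
        System.out.println(a == b);
        System.out.println(distinctKeys(a, b));
    }
}
```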

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
        <https://reviews.apache.org/r/3072/#comment8400>

        Where does the InterruptedException come from?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
        <https://reviews.apache.org/r/3072/#comment8399>

        Use brackets

        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java
        <https://reviews.apache.org/r/3072/#comment8401>

        Should not throw Exception

        - Ted

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/
        -----------------------------------------------------------

        (Updated 2011-12-12 11:51:59.547649)

        Review request for mahout.

        Summary
        -------

        MAHOUT-918 Parallelized SGD in MapReduce

        This addresses bug MAHOUT-918.
        https://issues.apache.org/jira/browse/MAHOUT-918

        Diffs (updated)


        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION

        Diff: https://reviews.apache.org/r/3072/diff

        Testing
        -------

        Thanks,

        issei

        jiraposter@reviews.apache.org added a comment -

        On 2011-12-08 07:04:49, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, line 36
        > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line36>
        >
        > Needs a comment about how this works.

        Added comments.

        On 2011-12-08 07:04:49, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 67-75
        > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line67>
        >
        > This really needs a comment. What is the purpose here?

        Added comments.

        On 2011-12-08 07:04:49, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 98-111
        > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line98>
        >
        > What is this intended to do? Why?

        Added comments.

        On 2011-12-08 07:04:49, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 30
        > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line30>
        >
        > Typo.
        >
        > Also, this doesn't say how this works or why it is the way it is.

        Fixed the typo and added comments.

        On 2011-12-08 07:04:49, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 32
        > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line32>
        >
        > Shouldn't there be a combiner as well?

        A combiner isn't needed because each map task submits one value overall.

        On 2011-12-08 07:04:49, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java, line 53
        > <https://reviews.apache.org/r/3072/diff/2/?file=63196#file63196line53>
        >
        > A comment here about what this weight is would be nice. Also, how can a double be a key? That is tantamount to comparing doubles, which is bad.

        Added comments. It is not the weight of the classifier but the weight used in the weighted average.
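For readers without the patch at hand, the weighted average referred to here can be sketched as follows. This is a minimal illustration in plain Java, not the code from the patch: the class name `WeightedModelAverage`, the `Partial` holder and the field `mixWeight` are all hypothetical. It assumes each map task contributes one coefficient vector together with one mixing weight, and that the weight travels as part of the value rather than as a key, so no doubles are ever compared.

```java
import java.util.List;

/** Hypothetical sketch of the averaging step; not the code from the MAHOUT-918 patch. */
public final class WeightedModelAverage {

    /** One map task's contribution: its trained coefficients and its mixing weight. */
    public static final class Partial {
        final double[] coefficients;
        final double mixWeight;

        public Partial(double[] coefficients, double mixWeight) {
            this.coefficients = coefficients;
            this.mixWeight = mixWeight;
        }
    }

    private WeightedModelAverage() {
    }

    /** Returns sum(mixWeight_k * coefficients_k) / sum(mixWeight_k). */
    public static double[] average(List<Partial> partials) {
        if (partials.isEmpty()) {
            throw new IllegalArgumentException("no partial models to average");
        }
        int n = partials.get(0).coefficients.length;
        double[] sum = new double[n];
        double totalWeight = 0.0;
        for (Partial p : partials) {
            if (p.coefficients.length != n) {
                throw new IllegalArgumentException("coefficient vectors differ in length");
            }
            for (int i = 0; i < n; i++) {
                sum[i] += p.mixWeight * p.coefficients[i];
            }
            totalWeight += p.mixWeight;
        }
        if (totalWeight <= 0.0) {
            throw new IllegalArgumentException("total mixing weight must be positive");
        }
        for (int i = 0; i < n; i++) {
            sum[i] /= totalWeight;
        }
        return sum;
    }
}
```

With equal mixing weights this reduces to a plain mean of the per-task models; how the patch actually chooses the weights is not stated in this thread.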

        On 2011-12-08 07:04:49, Ted Dunning wrote:

        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java, line 99

        > <https://reviews.apache.org/r/3072/diff/2/?file=63197#file63197line99>

        >

        > Where does the InterruptedException come from?

        It comes from the runIteration function.

        On 2011-12-08 07:04:49, Ted Dunning wrote:

        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java, lines 110-111

        > <https://reviews.apache.org/r/3072/diff/2/?file=63197#file63197line110>

        >

        > Use brackets

        Added brackets.

        On 2011-12-08 07:04:49, Ted Dunning wrote:

        > trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java, line 35

        > <https://reviews.apache.org/r/3072/diff/2/?file=63198#file63198line35>

        >

        > Should not throw Exception

        Added IOException and InterruptedException.

        On 2011-12-08 07:04:49, Ted Dunning wrote:

        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java, lines 53-56

        > <https://reviews.apache.org/r/3072/diff/2/?file=63195#file63195line53>

        >

        > This is nearly duplicated code. The mapper and reducer should share some code to avoid inconsistent defaults.

        Created a base class which shares the same initialization code.
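The shape of such a shared base class might look like the sketch below. It is an assumption-laden illustration, not the class from the patch: the property names (`sgd.learningRate`, `sgd.numFeatures`), the default values, and the use of a plain `Map` in place of a Hadoop `Configuration` are all hypothetical. The point it shows is only that the defaults live in one place, so the mapper side and the reducer side cannot drift apart.

```java
import java.util.Map;

/** Hypothetical sketch of shared initialization; names and defaults are illustrative only. */
public final class SharedInitSketch {

    private SharedInitSketch() {
    }

    /** Base class holding the one and only copy of the parameter defaults. */
    public abstract static class TaskBase {
        public static final String LEARNING_RATE_KEY = "sgd.learningRate"; // hypothetical key
        public static final String NUM_FEATURES_KEY = "sgd.numFeatures";   // hypothetical key
        public static final double DEFAULT_LEARNING_RATE = 0.1;            // hypothetical default
        public static final int DEFAULT_NUM_FEATURES = 1000;               // hypothetical default

        private double learningRate;
        private int numFeatures;

        /** Reads parameters from the job settings, falling back to the shared defaults. */
        public void configure(Map<String, String> conf) {
            learningRate = Double.parseDouble(
                conf.getOrDefault(LEARNING_RATE_KEY, String.valueOf(DEFAULT_LEARNING_RATE)));
            numFeatures = Integer.parseInt(
                conf.getOrDefault(NUM_FEATURES_KEY, String.valueOf(DEFAULT_NUM_FEATURES)));
        }

        public double getLearningRate() {
            return learningRate;
        }

        public int getNumFeatures() {
            return numFeatures;
        }
    }

    /** Stand-in for the mapper side; inherits configure() unchanged. */
    public static final class MapperTask extends TaskBase {
    }

    /** Stand-in for the reducer side; inherits configure() unchanged. */
    public static final class ReducerTask extends TaskBase {
    }
}
```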

        • issei

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/#review3734
        -----------------------------------------------------------

        On 2011-12-12 11:51:59, issei yoshida wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3072/

        -----------------------------------------------------------

        (Updated 2011-12-12 11:51:59)

        Review request for mahout.

        Summary

        -------

        MAHOUT-918 Parallelized SGD in MapReduce

        This addresses bug MAHOUT-918.

        https://issues.apache.org/jira/browse/MAHOUT-918

        Diffs

        -----

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION

        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION

        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION

        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION

        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION

        Diff: https://reviews.apache.org/r/3072/diff

        Testing

        -------

        Thanks,

        issei

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/#review3869
        -----------------------------------------------------------

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
        <https://reviews.apache.org/r/3072/#comment8694>

        This is a useless comment. The name says the same thing. Just putting in comments like this to satisfy a request for comments is very frustrating behavior.

        WHAT DOES THIS CODE INTEND TO DO AND WHY?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
        <https://reviews.apache.org/r/3072/#comment8695>

        Same comment. This is not a comment. This is a repetition. It adds nothing and shouldn't be here.

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
        <https://reviews.apache.org/r/3072/#comment8696>

        HOW AND WHY?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
        <https://reviews.apache.org/r/3072/#comment8697>

        How does the update work?

        How is the request for using the final iteration as initial weights made?

        Why does it work this way?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
        <https://reviews.apache.org/r/3072/#comment8698>

        Iterations of what?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
        <https://reviews.apache.org/r/3072/#comment8699>

        This is another non-comment.

        • Ted

        jiraposter@reviews.apache.org added a comment -

        On 2011-12-08 07:04:49, Ted Dunning wrote:

        >

        This code got worse with these comments, not better.

        • Ted

        jiraposter@reviews.apache.org added a comment -

        On 2011-12-08 07:04:49, Ted Dunning wrote:

        >

        Ted Dunning wrote:

        This code got worse with these comments, not better.

        Would you mind reviewing diff revision 3?
        You still seem to be looking at revision 2.

        • issei

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/
        -----------------------------------------------------------

        (Updated 2011-12-13 07:32:38.895973)

        Review request for mahout.

        Summary
        -------

        MAHOUT-918 Parallelized SGD in MapReduce

        This addresses bug MAHOUT-918.
        https://issues.apache.org/jira/browse/MAHOUT-918

        Diffs (updated)


        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1213193
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION

        Diff: https://reviews.apache.org/r/3072/diff

        Testing
        -------

        Thanks,

        issei

        jiraposter@reviews.apache.org added a comment -

        On 2011-12-08 07:04:49, Ted Dunning wrote:

        >

        Ted Dunning wrote:

        This code got worse with these comments, not better.

        issei yoshida wrote:

        Would you mind reviewing Diff revision 3?

        You still seem to be looking at revision 2.

        I have uploaded Diff revision 4, in which I added some comments,
        so please see revision 4.
        https://reviews.apache.org/r/3072/diff/

        • issei

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/#review3734
        -----------------------------------------------------------

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/#review3875
        -----------------------------------------------------------

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java
        <https://reviews.apache.org/r/3072/#comment8703>

        Direct and exact quotes from the paper should be either avoided or acknowledged. Better here to rephrase the language.

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java
        <https://reviews.apache.org/r/3072/#comment8704>

        Again, just quoting the paper is not a good idea. This isn't adding any information in any case since the exact same language was used in the class level java doc.

        It would be nice here to note that the average is an unweighted average.

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java
        <https://reviews.apache.org/r/3072/#comment8705>

        I don't think that this is correct. Is this really what the output is? Why are you dividing by a weight vector? How do you compute this score?

        Or do you mean to not divide here?

        If so, why do you use a score as the key?

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java
        <https://reviews.apache.org/r/3072/#comment8706>

        This looks like a bad key to use here.

        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java
        <https://reviews.apache.org/r/3072/#comment8707>

        I don't think that this is correct. In the Google paper, the average was unweighted. In any case, how do you compute this score for weighting?

        Also, if the key is the score, how does the reducer work since each reduce function will only see one score? Are you assuming that there is exactly one reducer?
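
        To make the concern about the key concrete, here is a minimal sketch in plain Java (no Hadoop classes) that imitates how the shuffle groups map outputs by key before the reduce function is called. The class name ScoreKeyGrouping, the method groupByKey, and the sample scores and weight vectors are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java imitation of shuffle grouping; hypothetical, not code from the patch. */
class ScoreKeyGrouping {

  /** Groups (key, value) pairs by key, the way map outputs are grouped before reduce is called. */
  static Map<Double, List<double[]>> groupByKey(double[] keys, double[][] values) {
    Map<Double, List<double[]>> groups = new TreeMap<>();
    for (int i = 0; i < keys.length; i++) {
      groups.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
    }
    return groups;
  }

  public static void main(String[] args) {
    // Three mappers, each emitting one weight vector (made-up numbers).
    double[][] weights = {{1.0, 2.0}, {3.0, 4.0}, {5.0, 6.0}};

    // Keyed by a per-mapper score: every key is distinct, so each group holds one vector.
    double[] scores = {-0.61, -0.58, -0.70};
    System.out.println(groupByKey(scores, weights).size() + " groups with score keys");

    // Keyed by a shared constant: all vectors land in the same group.
    double[] constant = {0.0, 0.0, 0.0};
    System.out.println(groupByKey(constant, weights).size() + " group with a constant key");
  }
}
```

        In Hadoop, reduce is invoked once per distinct key even when the job runs a single reduce task. With score keys, a reducer that averages only within one call would therefore see one weight vector at a time unless two scores happen to coincide; it would have to accumulate across calls, or the mappers would have to emit a shared key.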

        • Ted

        jiraposter@reviews.apache.org added a comment -

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/
        -----------------------------------------------------------

        (Updated 2011-12-14 08:59:29.074032)

        Review request for mahout.

        Summary
        -------

        MAHOUT-918 Parallelized SGD in MapReduce

        This addresses bug MAHOUT-918.
        https://issues.apache.org/jira/browse/MAHOUT-918

        Diffs (updated)


        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1214116
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION

        Diff: https://reviews.apache.org/r/3072/diff

        Testing
        -------

        Thanks,

        issei

        jiraposter@reviews.apache.org added a comment -

        On 2011-12-13 13:24:28, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, lines 36-41
        > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line36>
        >
        > Direct and exact quotes from the paper should be either avoided or acknowledged. Better here to rephrase the language.

        Rephrased the language in revision 5.

        On 2011-12-13 13:24:28, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, lines 60-63
        > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line60>
        >
        > Again, just quoting the paper is not a good idea. This isn't adding any information in any case since the exact same language was used in the class level java doc.
        >
        > It would be nice here to note that the average is an unweighted average.

        Rephrased the language in revision 5.

        On 2011-12-13 13:24:28, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, lines 87-88
        > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line87>
        >
        > This looks like a bad key to use here.

        This key should be the average log-likelihood of the best OnlineLogisticRegression in AdaptiveLogisticRegression.
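
        For readers unfamiliar with the quantity, the following is a minimal sketch of an average log-likelihood for a binary model, written in plain Java. It is an illustration under assumptions, not Mahout's implementation: the class and method names are hypothetical, the inputs are predicted probabilities of class 1, and the clamping constant is an arbitrary choice to avoid log(0).

```java
/** Hypothetical helper; illustrates the score only, not Mahout's own computation. */
class AverageLogLikelihood {

  /**
   * Mean of log p(y_i | x_i) over the examples, where p[i] is the predicted
   * probability that example i belongs to class 1 and y[i] is 0 or 1.
   */
  static double averageLogLikelihood(double[] p, int[] y) {
    double sum = 0.0;
    for (int i = 0; i < y.length; i++) {
      // Clamp to keep the logarithm finite for extreme predictions.
      double pi = Math.min(Math.max(p[i], 1e-12), 1.0 - 1e-12);
      sum += (y[i] == 1) ? Math.log(pi) : Math.log(1.0 - pi);
    }
    return sum / y.length;
  }

  public static void main(String[] args) {
    int[] labels = {1, 0};
    // A model that is more confident in the right direction scores closer to zero.
    System.out.println(averageLogLikelihood(new double[] {0.5, 0.5}, labels));
    System.out.println(averageLogLikelihood(new double[] {0.9, 0.1}, labels));
  }
}
```

        The value is always at most zero, and a larger (less negative) value indicates a better fit on the examples it was computed from.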

        On 2011-12-13 13:24:28, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, line 40
        > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line40>
        >
        > I don't think that this is correct. Is this really what the output is? Why are you dividing by a weight vector? How do you compute this score?
        >
        > Or do you mean to not divide here?
        >
        > If so, why do you use a score as the key?

        My explanation may have been poor; the comment means that the map output key is the score and the map output value is the new weight vector.
        I rewrote the comment in revision 5.

        On 2011-12-13 13:24:28, Ted Dunning wrote:
        > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java, lines 34-35
        > <https://reviews.apache.org/r/3072/diff/4/?file=64285#file64285line34>
        >
        > I don't think that this is correct. In the google paper, the average was unweighted. In any case how do you compute this score for weighting?
        >
        > Also, if the key is the score, how does the reducer work since each reduce function will only see one score? Are you assuming that there is exactly one reducer?

        The original paper (http://aclweb.org/anthology-new/N/N10/N10-1069.pdf) says it is a weighted average,
        but my simple experiment showed that the unweighted average was better than the weighted average.
        So I rewrote the code to use the unweighted average in revision 5.

        The number of reducers should be set to one. I added a comment to that effect in revision 5.
        The number of reducers is set in the runIteration function of the Driver class.
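
        The difference between the two averaging schemes discussed above can be sketched in plain Java as follows. This is a minimal illustration under assumptions, not the code from the patch: the class ParameterMixing, its method names, and the mixing coefficients are hypothetical, and plain double arrays stand in for Mahout vectors.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the mixing step that a single reducer would perform. */
class ParameterMixing {

  /** Unweighted mean of the per-mapper weight vectors. */
  static double[] unweightedAverage(List<double[]> vectors) {
    double[] mean = new double[vectors.get(0).length];
    for (double[] v : vectors) {
      for (int j = 0; j < mean.length; j++) {
        mean[j] += v[j] / vectors.size();
      }
    }
    return mean;
  }

  /** Mean with one non-negative coefficient per vector; coefficients are normalised to sum to 1. */
  static double[] weightedAverage(List<double[]> vectors, double[] mix) {
    double total = Arrays.stream(mix).sum();
    double[] mean = new double[vectors.get(0).length];
    for (int i = 0; i < vectors.size(); i++) {
      for (int j = 0; j < mean.length; j++) {
        mean[j] += mix[i] / total * vectors.get(i)[j];
      }
    }
    return mean;
  }

  public static void main(String[] args) {
    // Weight vectors from two mappers (made-up numbers).
    List<double[]> vectors = Arrays.asList(new double[] {1.0, 2.0}, new double[] {3.0, 6.0});
    System.out.println(Arrays.toString(unweightedAverage(vectors)));
    // The second mapper counts three times as much as the first.
    System.out.println(Arrays.toString(weightedAverage(vectors, new double[] {1.0, 3.0})));
  }
}
```

        With equal coefficients the weighted form reduces to the unweighted one, so the choice between them comes down to whether a per-mapper score is trusted enough to act as a mixing coefficient.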

        • issei

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        https://reviews.apache.org/r/3072/#review3875
        -----------------------------------------------------------

        On 2011-12-14 08:59:29, issei yoshida wrote:

        -----------------------------------------------------------

        This is an automatically generated e-mail. To reply, visit:

        https://reviews.apache.org/r/3072/

        -----------------------------------------------------------

        (Updated 2011-12-14 08:59:29)

        Review request for mahout.

        Summary

        -------

        MAHOUT-918 Parallelized SGD in MapReduce

        This addresses bug MAHOUT-918.

        https://issues.apache.org/jira/browse/MAHOUT-918

        Diffs
        -----
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1214116
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION
        trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION
        trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION

        Diff: https://reviews.apache.org/r/3072/diff

        Testing
        -------

        Thanks,

        issei
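
        The review replies in this comment say that, as of revision 5, a single reducer combines the mappers' weight vectors with an unweighted average. The following is a minimal sketch of that averaging step only, assuming plain double[] arrays instead of Mahout's vector types. The class name ParameterMixingSketch and the method average are hypothetical and are not taken from the patch.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; not part of the MAHOUT-918 patch. */
public class ParameterMixingSketch {

    /** Returns the element-wise unweighted mean of the per-mapper weight vectors. */
    static double[] average(List<double[]> mapperWeights) {
        if (mapperWeights.isEmpty()) {
            throw new IllegalArgumentException("need at least one weight vector");
        }
        int dim = mapperWeights.get(0).length;
        double[] mean = new double[dim];
        for (double[] w : mapperWeights) {
            if (w.length != dim) {
                throw new IllegalArgumentException("weight vectors differ in length");
            }
            // Accumulate the sum; every mapper contributes equally (no score weighting).
            for (int i = 0; i < dim; i++) {
                mean[i] += w[i];
            }
        }
        for (int i = 0; i < dim; i++) {
            mean[i] /= mapperWeights.size();
        }
        return mean;
    }

    public static void main(String[] args) {
        // Two made-up weight vectors standing in for the output of two mappers.
        List<double[]> fromMappers = Arrays.asList(
            new double[] {1.0, 2.0},
            new double[] {3.0, 6.0});
        System.out.println(Arrays.toString(average(fromMappers)));
    }
}
```

        In an iterative parameter mixing setup, the averaged vector would presumably be handed back to the mappers as the starting point for the next iteration. Because every mapper's vector has to reach the same reduce call, the sketch matches the reply's statement that the number of reducers should be one.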

        Robin Anil added a comment -

        There was a lot of good progress on the review board, and then silence. issei yoshida, can you revive this and work on it for the next release?


          People

          • Assignee: Ted Dunning
          • Reporter: issei yoshida
          • Votes: 0
          • Watchers: 3
