MAHOUT-702

Implement Online Passive Aggressive learner

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.6
    • Component/s: Classification
    • Labels:
      None

      Description

      Implements an online passive-aggressive learner that minimizes label ranking loss.
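
For orientation, one common way to write a label ranking hinge loss and its passive-aggressive step is sketched below. It follows the standard multiclass passive-aggressive formulation rather than anything stated in this issue. The symbols are assumptions made for illustration: w_c is the weight row for category c, x is the instance, y is the true label, r is the highest-scoring wrong label, and C is an aggressiveness parameter. The committed code may use a different step size.

```latex
\begin{aligned}
r    &= \arg\max_{c \neq y} \; w_c \cdot x \\
\ell &= \max\bigl(0,\; 1 - w_y \cdot x + w_r \cdot x\bigr) \\
\tau &= \frac{\ell}{2\lVert x \rVert^2 + \frac{1}{2C}} \\
w_y  &\leftarrow w_y + \tau x, \qquad w_r \leftarrow w_r - \tau x
\end{aligned}
```

When the true label already outscores every other label by a margin of at least 1, the loss is zero and the weights are left unchanged.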

      1. MAHOUT-702.patch
        20 kB
        Hector Yee
      2. MAHOUT-702.patch
        8 kB
        Hector Yee

        Activity

        Hector Yee created issue -
        Hector Yee added a comment -

        Implementation and unit test for passive aggressive.

        Hector Yee made changes -
        Field Original Value New Value
        Attachment MAHOUT-702.patch [ 12479598 ]
        Hector Yee made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Ted Dunning added a comment -

        Nice work Hector.

        I have a few comments.

        First, doesn't your train method destroy the input vector? That seems like bad manners. I think that you can get the
        effect you want without an additional copy being made by accumulating into the two rows of the weights matrix.

        Secondly, I see why you put the test into the existing test so that you could re-use some framework.

        My preference is to keep a bit of separation, however. What do you think about factoring out the
        common structure and having both kinds of test extend the same abstract class?

        Also, does your PA learner have any regularization other than early stopping? What about annealing
        of the learning rate?

        Finally, what do you think about putting this under a similar framework as AdaptiveLogisticRegression
        in order to get auto-tuning of the learning rate?
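
The first point above, updating two rows of the weight matrix without touching the caller's instance, can be sketched as follows. This is an illustration only, not the code from the attached patch: the class name PaSketch, the plain double[] representation, the aggressiveness parameter, and the particular step size are all assumptions chosen to keep the example free of Mahout dependencies.

```java
/** Hypothetical stand-in for a multiclass passive-aggressive learner; not the Mahout class. */
final class PaSketch {
    private final double[][] weights;     // one row of weights per category
    private final double aggressiveness;  // assumed tuning parameter (often written C)

    PaSketch(int numCategories, int numFeatures, double aggressiveness) {
        this.weights = new double[numCategories][numFeatures];
        this.aggressiveness = aggressiveness;
    }

    private static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Returns the index of the highest-scoring category for x. */
    int classify(double[] x) {
        int best = 0;
        for (int c = 1; c < weights.length; c++) {
            if (dot(weights[c], x) > dot(weights[best], x)) {
                best = c;
            }
        }
        return best;
    }

    /**
     * One online update. The instance x is only read, never written,
     * so the caller's vector survives the call. Returns the hinge ranking loss.
     * Assumes at least two categories.
     */
    double train(int actual, double[] x) {
        // Find the highest-scoring category other than the true one.
        int other = -1;
        for (int c = 0; c < weights.length; c++) {
            if (c != actual && (other < 0 || dot(weights[c], x) > dot(weights[other], x))) {
                other = c;
            }
        }
        double loss = Math.max(0.0, 1.0 - dot(weights[actual], x) + dot(weights[other], x));
        if (loss > 0.0) {
            // Assumed step size; the update moves two rows, hence the factor of 2.
            double tau = loss / (2.0 * dot(x, x) + 1.0 / (2.0 * aggressiveness));
            for (int i = 0; i < x.length; i++) {
                weights[actual][i] += tau * x[i];  // accumulate directly into the true row
                weights[other][i] -= tau * x[i];   // and into the competing row
            }
        }
        return loss;
    }
}
```

Because the scaled instance is added to the rows element by element, no temporary copy of x is needed and x itself is never overwritten.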

        Hector Yee added a comment -

        For #1, I thought I made a copy, but I guess it's still a reference from the Java point of view. I'll change it.

        For #2, sure, I'll add a new unit test.

        It doesn't have regularization or annealing; the hinge loss just stops learning (gradient = 0) if it gets things correct. I've just saved the ongoing best performer on a holdout set in lieu of annealing the learning rate. I would prefer to do this and the auto-tuning in a different patch, if that's OK?
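
The two ideas in this reply, a hinge loss that is exactly zero once the margin is met and a snapshot of the best model on a holdout set, can be sketched like this. The class HoldoutSnapshot and its methods are hypothetical names for illustration; the patch's actual bookkeeping is not shown in this issue.

```java
/** Hypothetical helper: remembers the weights that scored best on a holdout set. */
final class HoldoutSnapshot {
    private double bestAccuracy = -1.0;
    private double[][] bestWeights;

    /** Hinge ranking loss: zero when the correct score leads by a margin of at least 1. */
    static double hingeRankingLoss(double correctScore, double bestWrongScore) {
        return Math.max(0.0, 1.0 - correctScore + bestWrongScore);
    }

    /**
     * Offers the current weights together with their holdout accuracy.
     * Stores a deep copy if the accuracy beats the best seen so far.
     * Returns true when a new snapshot was saved.
     */
    boolean offer(double[][] weights, double accuracy) {
        if (accuracy <= bestAccuracy) {
            return false;
        }
        bestAccuracy = accuracy;
        bestWeights = new double[weights.length][];
        for (int r = 0; r < weights.length; r++) {
            bestWeights[r] = weights[r].clone();  // copy so later training cannot alter it
        }
        return true;
    }

    double[][] best() {
        return bestWeights;
    }

    double bestAccuracy() {
        return bestAccuracy;
    }
}
```

A training loop would call offer periodically with the learner's weights and its accuracy on held-out examples, then use best() as the final model. A zero loss means a zero update, which is why correctly ranked examples leave the weights alone.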

        Hector Yee added a comment -
        • fixed bug in stomping instance
        • factored out online test case
        • added unit test for PA
        Hector Yee made changes -
        Attachment MAHOUT-702.patch [ 12479711 ]
        Ted Dunning added a comment -

        This is more important than it looks since it represents an important generalization
        of the classifier framework. It is the thin end of the wedge.

        Ted Dunning made changes -
        Fix Version/s 0.6 [ 12316364 ]
        Hector Yee added a comment -

        Is this good to go?

        Sean Owen added a comment -

        On the grounds that it looks good, looks like it refactors and imitates similar code, has tests, has had a look from Ted – going to commit it with some style tweaks.

        Sean Owen made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Assignee Ted Dunning [ tdunning ]
        Resolution Fixed [ 1 ]
        Hudson added a comment -

        Integrated in Mahout-Quality #861 (See https://builds.apache.org/hudson/job/Mahout-Quality/861/)
        MAHOUT-702 add passive-aggressive learner

        srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1131476
        Files :

        • /mahout/trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java
        • /mahout/trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineBaseTest.java
        • /mahout/trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java
        • /mahout/trunk/core/src/test/java/org/apache/mahout/classifier/sgd/PassiveAggressiveTest.java
        Sean Owen made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee: Ted Dunning
          • Reporter: Hector Yee
          • Votes: 0
          • Watchers: 0

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              • Estimated: 24h
              • Remaining: 24h
              • Logged: Not Specified
