Nice work, Hector.
I have a few comments.
First, doesn't your train method destroy the input vector? That seems like bad manners. I think you can get the
effect you want without making an additional copy by accumulating directly into the two rows of the weights matrix.
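A rough sketch of the in-place accumulation, assuming the usual multiclass passive-aggressive update (correct row moves toward the instance, best-scoring wrong row moves away). Plain double[] arrays stand in for Mahout's Vector and Matrix types, and the class and method names are made up for illustration, so treat this as a shape to aim for rather than a drop-in replacement:

```java
// Sketch only: plain arrays stand in for Mahout's Matrix and Vector types.
final class PassiveAggressiveSketch {
    // One row of weights per category.
    private final double[][] weights;

    PassiveAggressiveSketch(int numCategories, int numFeatures) {
        if (numCategories < 2) {
            throw new IllegalArgumentException("need at least two categories");
        }
        weights = new double[numCategories][numFeatures];
    }

    private static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Updates the weights for one example. The instance is only read, never written.
    void train(int actual, double[] instance) {
        // Find the highest-scoring category other than the correct one.
        int rival = -1;
        double rivalScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < weights.length; c++) {
            if (c == actual) {
                continue;
            }
            double score = dot(weights[c], instance);
            if (score > rivalScore) {
                rivalScore = score;
                rival = c;
            }
        }

        // Hinge loss on the margin between the correct row and the rival row.
        double loss = 1.0 - (dot(weights[actual], instance) - rivalScore);
        double normSquared = dot(instance, instance);
        if (rival < 0 || !(loss > 0.0) || normSquared == 0.0) {
            return; // margin already satisfied, or nothing sensible to update
        }

        // Accumulate straight into the two affected rows: no copy, no mutation of the input.
        double tau = loss / (2.0 * normSquared);
        for (int i = 0; i < instance.length; i++) {
            double delta = tau * instance[i];
            weights[actual][i] += delta;
            weights[rival][i] -= delta;
        }
    }

    // Defensive copy so callers cannot alter the model by accident.
    double[] weightsFor(int category) {
        return weights[category].clone();
    }
}
```

The point is only that the scaled instance never needs to exist as a separate vector: each component is scaled on the fly and added to one row and subtracted from the other.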
Second, I see why you folded the new test into the existing test: it lets you reuse some of the framework.
My preference is to keep a bit of separation, however. What do you think about factoring out the
common structure and having both kinds of test extend the same abstract class?
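To make the suggestion concrete, here is one possible shape for that refactoring. Everything in it is hypothetical: the interface is not Mahout's OnlineLearner, the data is invented, and a simple perceptron stands in for a real learner so the sketch is self-contained. In the real tests the shared method would be a JUnit test in the base class:

```java
// Hypothetical minimal learner contract, not Mahout's OnlineLearner interface.
interface LearnerUnderTest {
    void train(int actual, double[] instance);
    int classify(double[] instance);
}

// Shared structure: the data and the accuracy check live here once.
abstract class AbstractLearnerTestSketch {
    // Tiny linearly separable data set, invented for illustration.
    private static final double[][] DATA = {{1.0, 0.0}, {0.9, 0.1}, {0.0, 1.0}, {0.1, 0.9}};
    private static final int[] LABELS = {0, 0, 1, 1};

    // Each concrete test supplies the learner it wants to exercise.
    protected abstract LearnerUnderTest createLearner(int numCategories, int numFeatures);

    final double trainAndMeasureAccuracy(int passes) {
        LearnerUnderTest learner = createLearner(2, 2);
        for (int p = 0; p < passes; p++) {
            for (int i = 0; i < DATA.length; i++) {
                learner.train(LABELS[i], DATA[i]);
            }
        }
        int correct = 0;
        for (int i = 0; i < DATA.length; i++) {
            if (learner.classify(DATA[i]) == LABELS[i]) {
                correct++;
            }
        }
        return (double) correct / DATA.length;
    }
}

// One small subclass per learner; a perceptron is used here as a stand-in.
final class PerceptronTestSketch extends AbstractLearnerTestSketch {
    @Override
    protected LearnerUnderTest createLearner(int numCategories, int numFeatures) {
        final double[][] w = new double[numCategories][numFeatures];
        return new LearnerUnderTest() {
            @Override
            public int classify(double[] x) {
                int best = 0;
                double bestScore = Double.NEGATIVE_INFINITY;
                for (int c = 0; c < w.length; c++) {
                    double s = 0.0;
                    for (int i = 0; i < x.length; i++) {
                        s += w[c][i] * x[i];
                    }
                    if (s > bestScore) {
                        bestScore = s;
                        best = c;
                    }
                }
                return best;
            }

            @Override
            public void train(int actual, double[] x) {
                int predicted = classify(x);
                if (predicted == actual) {
                    return;
                }
                for (int i = 0; i < x.length; i++) {
                    w[actual][i] += x[i];
                    w[predicted][i] -= x[i];
                }
            }
        };
    }
}
```

With this layout the passive-aggressive test and the existing test would each be a subclass that overrides only createLearner, so neither has to live inside the other.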
Also, does your PA learner have any regularization other than early stopping? What about annealing
of the learning rate?
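For illustration, one way the two ideas could combine is to cap the step size, as in the PA-I variant where the step is min(C, loss / ||x||^2), and then let that cap shrink over time. The sketch below assumes the two-row multiclass update (hence the factor of 2 in the denominator) and an inverse-decay schedule chosen purely as an example. The class name and the parameters c0 and decay are hypothetical:

```java
// Sketch only: a capped, annealed step size for a passive-aggressive update.
final class AnnealedStepSketch {
    private final double c0;    // hypothetical initial cap on the step size
    private final double decay; // hypothetical decay rate for the cap

    AnnealedStepSketch(double c0, double decay) {
        this.c0 = c0;
        this.decay = decay;
    }

    // Cap after a given number of training steps: c0 / (1 + decay * step).
    double cap(int step) {
        return c0 / (1.0 + decay * step);
    }

    // Step size for one example: the unconstrained PA step, limited by the annealed cap.
    double stepSize(double loss, double normSquared, int step) {
        if (!(loss > 0.0) || normSquared == 0.0) {
            return 0.0; // margin satisfied or empty instance: no update
        }
        return Math.min(cap(step), loss / (2.0 * normSquared));
    }
}
```

The cap acts as a regularizer, since a single noisy example can no longer move the weights arbitrarily far, and shrinking it over time plays the role of annealing the learning rate.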
Finally, what do you think about putting this under a framework similar to AdaptiveLogisticRegression
in order to get auto-tuning of the learning rate?