For some learning tasks (e.g. SGD), the input needs to be randomized rather than presented in blocks grouped by label. SplitInput is a useful tool for setting up train/test files, but it currently doesn't support randomizing the input.
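To illustrate the request, here is a minimal standalone sketch of shuffling the input before splitting it into train and test sets. It is not the SplitInput code or the attached patch. The class `ShuffledSplit`, the method `shuffleAndSplit`, and the parameters `testFraction` and `seed` are hypothetical names chosen for this example, and it assumes the whole input fits in memory as a list of lines.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Mahout: shuffles lines, then splits them.
class ShuffledSplit {

    /** Holds the two parts of a split. */
    static final class Split {
        final List<String> train;
        final List<String> test;

        Split(List<String> train, List<String> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the input with a seeded Random, then takes the
     * first testFraction of the shuffled lines as test data and the rest
     * as training data. The input list itself is left unchanged.
     */
    static Split shuffleAndSplit(List<String> lines, double testFraction, long seed) {
        if (testFraction < 0.0 || testFraction > 1.0) {
            throw new IllegalArgumentException("testFraction must be in [0, 1]");
        }
        List<String> shuffled = new ArrayList<>(lines);
        Collections.shuffle(shuffled, new Random(seed));

        int testSize = (int) Math.round(shuffled.size() * testFraction);
        List<String> test = new ArrayList<>(shuffled.subList(0, testSize));
        List<String> train = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
        return new Split(train, test);
    }

    public static void main(String[] args) {
        // Input arrives in blocks of one label at a time, as in the issue description.
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            lines.add("spam\texample " + i);
        }
        for (int i = 0; i < 5; i++) {
            lines.add("ham\texample " + i);
        }

        Split split = shuffleAndSplit(lines, 0.2, 42L);
        System.out.println("train: " + split.train);
        System.out.println("test:  " + split.test);
    }
}
```

Because the shuffle happens before the split, both the training and the test portion end up in random order, and a fixed seed keeps the split reproducible. Shuffling only the training portion would be a different design choice and is not shown here.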
|Field||Original Value||New Value|
|Attachment||MAHOUT-904.patch [ 12506712 ]|
Remi Melisson made changes -
I had a look at it too, and one question remains:
Do we need to randomize the whole set (training and test) or only the training data?
@Raphael, let me know if you have already started, because I plan to begin development soon.
|Assignee||Grant Ingersoll [ gsingers ]||Raphael Cendrillon [ cendrillon ]|
|Status||Open [ 1 ]||Patch Available [ 10002 ]|
|Fix Version/s||0.6 [ 12316364 ]|
|Status||Patch Available [ 10002 ]||Resolved [ 5 ]|
|Resolution||Fixed [ 1 ]|
Sean Owen made changes -
|Status||Resolved [ 5 ]||Closed [ 6 ]|