Mahout / MAHOUT-904

SplitInput should support randomizing the input

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.6
    • Component/s: None

      Description

      Some learning tasks, such as SGD, need the input in random order rather than as blocks of examples that all share a label. SplitInput is a useful tool for setting up train/test files, but it currently doesn't support randomizing the input.
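
To illustrate the request, here is a minimal sketch in plain Java of shuffling examples before a train/test split. It does not reproduce SplitInput or the attached patch: the class `ShuffleSplitSketch`, the method `shuffleAndSplit`, and its `testFraction` and `seed` parameters are hypothetical names chosen for this example, and it assumes the whole input fits in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper, not part of Mahout: shuffles examples, then splits them. */
final class ShuffleSplitSketch {

  private ShuffleSplitSketch() {}

  /**
   * Shuffles a copy of the examples with a seeded generator and cuts it in two.
   * Returns a two-element list: index 0 holds the training examples,
   * index 1 holds the test examples. The input list is not modified.
   */
  static List<List<String>> shuffleAndSplit(List<String> examples,
                                            double testFraction,
                                            long seed) {
    if (testFraction < 0.0 || testFraction > 1.0) {
      throw new IllegalArgumentException("testFraction must be in [0, 1]");
    }

    // Work on a copy so the caller's list keeps its original order.
    List<String> shuffled = new ArrayList<>(examples);

    // A fixed seed makes the permutation reproducible across runs.
    Collections.shuffle(shuffled, new Random(seed));

    // The first testSize shuffled examples become the test set.
    int testSize = (int) Math.round(shuffled.size() * testFraction);
    List<String> test = new ArrayList<>(shuffled.subList(0, testSize));
    List<String> train = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));

    List<List<String>> result = new ArrayList<>();
    result.add(train);
    result.add(test);
    return result;
  }
}
```

If the input is sorted by label, an unshuffled cut at the 80% mark draws the whole test set from the labels at the end of the file. Shuffling first mixes the labels across both parts, and calling `ShuffleSplitSketch.shuffleAndSplit(lines, 0.2, 42L)` twice with the same seed yields the same split.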

      1. MAHOUT-904.patch (48 kB, Grant Ingersoll)
      2. MAHOUT-904.patch (48 kB, Grant Ingersoll)
      3. MAHOUT-904.patch (43 kB, Raphael Cendrillon)
      4. MAHOUT-904.patch (42 kB, Grant Ingersoll)
      5. MAHOUT-904.patch (39 kB, Grant Ingersoll)
      6. MAHOUT-904.patch (32 kB, Raphael Cendrillon)
      7. MAHOUT-904.patch (14 kB, Grant Ingersoll)
      8. MAHOUT-904.patch (11 kB, Raphael Cendrillon)
      9. MAHOUT-904.patch (8 kB, Raphael Cendrillon)

        Activity

        Grant Ingersoll created issue -
        Grant Ingersoll made changes -
        Field Original Value New Value
        Labels MAHOUT_INTRO_CONTRIBUTE
        Raphael Cendrillon made changes -
        Attachment MAHOUT-904.patch [ 12506712 ]
        Raphael Cendrillon made changes -
        Attachment MAHOUT-904.patch [ 12506713 ]
        Raphael Cendrillon made changes -
        Attachment MAHOUT-904.patch [ 12506712 ]
        Remi Melisson made changes -
        Comment [ Hi,
        I had a look at it too, and one question remains:
        Do we need to randomize the whole set (training and test) or only the training data?

        @Raphael Let me know if you have already started, because I was planning to begin development soon. ]
        Raphael Cendrillon made changes -
        Assignee: Grant Ingersoll [ gsingers ] → Raphael Cendrillon [ cendrillon ]
        Raphael Cendrillon made changes -
        Attachment MAHOUT-904.patch [ 12507841 ]
        Grant Ingersoll made changes -
        Attachment MAHOUT-904.patch [ 12507859 ]
        Raphael Cendrillon made changes -
        Attachment MAHOUT-904.patch [ 12508346 ]
        Raphael Cendrillon made changes -
        Status: Open [ 1 ] → Patch Available [ 10002 ]
        Grant Ingersoll made changes -
        Attachment MAHOUT-904.patch [ 12508397 ]
        Grant Ingersoll made changes -
        Attachment MAHOUT-904.patch [ 12508485 ]
        Raphael Cendrillon made changes -
        Attachment MAHOUT-904.patch [ 12508579 ]
        Grant Ingersoll made changes -
        Attachment MAHOUT-904.patch [ 12508632 ]
        Grant Ingersoll made changes -
        Fix Version/s 0.6 [ 12316364 ]
        Grant Ingersoll made changes -
        Attachment MAHOUT-904.patch [ 12508802 ]
        Raphael Cendrillon made changes -
        Status: Patch Available [ 10002 ] → Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Sean Owen made changes -
        Status: Resolved [ 5 ] → Closed [ 6 ]

          People

          • Assignee: Raphael Cendrillon
          • Reporter: Grant Ingersoll
          • Votes: 0
          • Watchers: 0
