  1. Mahout
  2. MAHOUT-286

Need to be able to run classifiers from non-text input (such as ARFF data)


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 0.3
    • Fix Version/s: 0.5
    • Component/s: None
    • Labels: None

    Description

      Martin Haeger wrote this:

      We're experimenting a bit with Weka and Mahout. Our input data is a
      relation in ARFF format (see attached data.training.arff), and we'd
      like to classify it using Mahout. However, it seems (to us, at first)
      that the Mahout classifier.bayes.interfaces.Algorithm interface is
      centered around documents of text, and not general attribute data.
      Thus, running the classifier causes our ARFF data to be interpreted as
      a document of words, with not very useful results (see attached
      mahout.log).

      With Weka, we're able to get the results we want (see attached weka.log).

      Any suggestions for how to get this working?
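
      One possible workaround, sketched here as an assumption and not taken
      from the attached files: turn each ARFF row into a small "document"
      whose words are attribute=value tokens, so that a text-oriented
      classifier sees one token per attribute instead of raw ARFF syntax.
      The sample relation, the parse_arff and row_to_tokens helpers, and the
      choice of "play" as the class attribute are all hypothetical. The
      parser handles only a small subset of ARFF (no quoted values, no
      sparse rows), and numeric attributes would probably need binning
      before this encoding is useful.

```python
# Hypothetical sample relation; not the attached data.training.arff.
SAMPLE = """\
% comment lines start with a percent sign
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,TRUE,no
overcast,FALSE,yes
"""


def parse_arff(text):
    """Parse a small subset of ARFF: attribute names and dense data rows."""
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            rows.append(dict(zip(names, values)))
        elif lower.startswith("@attribute"):
            names.append(line.split()[1])  # second field is the attribute name
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


def row_to_tokens(row, label_attr):
    """Split a row into its class label and a list of attribute=value tokens."""
    label = row[label_attr]
    tokens = [f"{name}={value}" for name, value in row.items() if name != label_attr]
    return label, tokens


if __name__ == "__main__":
    _, rows = parse_arff(SAMPLE)
    for row in rows:
        label, tokens = row_to_tokens(row, "play")
        # One line per instance: label, a tab, then the space-separated tokens.
        print(label + "\t" + " ".join(tokens))
```

      Each output line pairs a label with space-separated tokens. Whether a
      given classifier's input reader accepts that layout is also an
      assumption that would need checking.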

      Attachments

        1. weka.log
          2 kB
          Ted Dunning
        2. run.sh
          1.0 kB
          Martin Häger
        3. mahout.log
          25 kB
          Ted Dunning
        4. data.training.arff
          8 kB
          Martin Häger
        5. data.arff
          7 kB
          Martin Häger

          People

            Assignee: Unassigned
            Reporter: Ted Dunning
            Votes: 0
            Watchers: 0
