Mahout / MAHOUT-1525

train/validateAdaptiveLogistic

Details

    Description

      Hi,
      I tried to use trainAdaptiveLogistic and validateAdaptiveLogistic on my data, which has the following columns:
      category, id, var1, var2, ...var72 (all numeric)

      I used the following settings:
      mahout trainAdaptiveLogistic --input resource/trainingData \
      --output ./model \
      --target category --categories 9 \
      --predictors a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 ..... \
      --types numeric \
      --passes 100 \
      --showperf

      mahout validateAdaptiveLogistic --input resource/testData --model model --confusion --defaultCategory none

      The output of validateAdaptiveLogistic is:
      Log-likelihood:Min=-5.54, Max=-0.04, Mean=-1.58, Median=-1.33

      =======================================================
      Confusion Matrix
      -------------------------------------------------------
      a   b   d   e   f   g   h   i   <--Classified as
      14  0   0   0   0   0   0   0   |  14  a = projekt
      0   18  0   0   0   0   0   0   |  18  b = news/aktuelles/presse
      0   0   24  0   0   0   0   0   |  24  d = lehrveranstaltung
      0   0   0   19  0   0   0   0   |  19  e = publikation
      0   0   0   0   20  0   0   0   |  20  f = event
      0   0   0   0   0   14  0   0   |  14  g = mitarbeiter/person
      0   0   0   0   0   0   44  0   |  44  h = übersicht
      0   0   0   0   0   0   0   13  |  13  i = institut

      (In case you were wondering, the category names are in German.)

      My problem is that this result should be impossible. I always get a perfect classification, even with only a small amount of training data. It doesn't even matter how many features I use: I tried it with all 72 and with only one. Am I missing something?
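
      The log-likelihood line also seems hard to reconcile with the matrix. Below is a minimal sketch of the arithmetic in Python. It assumes (this is not confirmed anywhere in the output) that the reported log-likelihood is the natural log of the probability the model assigns to the true category, that the probabilities over the 9 categories sum to 1, and that each row of the matrix is assigned to the highest-probability category. Under those assumptions a correctly classified example must have a true-category probability of at least 1/9, i.e. a log-likelihood of about -2.20 or higher, yet the reported minimum is -5.54.

```python
import math

# Values copied from the command line and the validateAdaptiveLogistic output.
n_categories = 9      # from --categories 9
min_ll = -5.54        # reported Min log-likelihood
mean_ll = -1.58       # reported Mean log-likelihood

# Assumption: the predicted category is the argmax over probabilities that
# sum to 1. The winning category then has probability >= 1 / n_categories,
# so a correctly classified example has log-likelihood >= log(1/9).
floor_ll = math.log(1.0 / n_categories)

# Probability of the true category implied by the worst example.
p_worst = math.exp(min_ll)

# True only if every example could have been classified correctly.
consistent = min_ll >= floor_ll

print(f"lowest log-likelihood a correct prediction can have: {floor_ll:.2f}")
print(f"true-category probability at Min=-5.54: {p_worst:.4f}")
print(f"perfect matrix consistent with Min log-likelihood: {consistent}")
```

      If those assumptions hold, at least one test example cannot have been classified correctly, so the perfect diagonal probably reflects how the matrix is built rather than the quality of the model.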

      Regards,
      Richard

          People

            Assignee: Unassigned
            Reporter: Pilgrim Richard Scharrer