Details
Type: Question
Status: Closed
Priority: Major
Resolution: Fixed
0.7, 0.8, 0.9
None
Description
Hi,
I tried to use trainAdaptiveLogistic and validateAdaptiveLogistic on my data, which looks like this:
category, id, var1, var2, ..., var72 (all numeric)
I used the following settings:
mahout trainAdaptiveLogistic --input resource/trainingData \
  --output ./model \
  --target category --categories 9 \
  --predictors a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 ..... \
  --types numeric \
  --passes 100 \
  --showperf

mahout validateAdaptiveLogistic --input resource/testData --model model --confusion --defaultCategory none
The output of validateAdaptiveLogistic is:
Log-likelihood:Min=-5.54, Max=-0.04, Mean=-1.58, Median=-1.33
=======================================================
Confusion Matrix
-------------------------------------------------------
  a   b   d   e   f   g   h   i  <--Classified as
 14   0   0   0   0   0   0   0  |  14  a = projekt
  0  18   0   0   0   0   0   0  |  18  b = news/aktuelles/presse
  0   0  24   0   0   0   0   0  |  24  d = lehrveranstaltung
  0   0   0  19   0   0   0   0  |  19  e = publikation
  0   0   0   0  20   0   0   0  |  20  f = event
  0   0   0   0   0  14   0   0  |  14  g = mitarbeiter/person
  0   0   0   0   0   0  44   0  |  44  h = übersicht
  0   0   0   0   0   0   0  13  |  13  i = institut
(In case you were wondering, the category names are in German.)
My problem is that this result seems impossible. I always get a perfect classification, even with only a small amount of training data. It doesn't even matter how many features I use: I tried it with all 72 and with only one. Am I missing something?
Regards,
Richard
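
A perfect confusion matrix regardless of the number of features is often a sign that the label is leaking into the predictors, or that the test rows also appear in the training data. The following is a minimal sketch of two checks that can be run outside Mahout. It is not part of Mahout, and it assumes that both files are CSV with a header row containing the `category` and `id` columns described above; the function names are hypothetical.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Read a CSV file with a header row into a list of dicts (assumed layout)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def columns_determining_target(rows, target="category"):
    """Return columns whose values each map to exactly one target value.

    Columns in which every value is unique (an id, for example) are skipped,
    because they would trivially satisfy the condition.
    """
    if not rows:
        return []
    mapping = defaultdict(dict)  # column -> {value: first target seen with it}
    candidates = set(rows[0]) - {target}
    for row in rows:
        for col in list(candidates):
            first_seen = mapping[col].setdefault(row[col], row[target])
            if first_seen != row[target]:
                candidates.discard(col)  # same value, two different targets
    return sorted(c for c in candidates if len(mapping[c]) < len(rows))


def shared_ids(train_rows, test_rows, id_column="id"):
    """Return ids that occur in both the training and the test rows."""
    train_ids = {r[id_column] for r in train_rows}
    test_ids = {r[id_column] for r in test_rows}
    return sorted(train_ids & test_ids)
```

If `columns_determining_target(load_rows("resource/trainingData"))` reports a column that is also passed to `--predictors`, or `shared_ids` returns a non-empty list for the two files, the perfect score would be explained by the data rather than by the model. This is only a heuristic: a continuous feature with few repeated values can also be flagged by chance.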