Details
New Feature
Status: Closed
Major
Resolution: Not A Problem
0.8
None
None
Description
Sometimes the confusion matrix is too big and not really necessary.
There is also another case in favor of this possibility:
If you split a dataset with many labels into a test dataset and a training dataset by random percentage, some classes/labels in the test data may not appear in the training data at all. A model built from the training data then has a label index that does not include those labels. If you test this model with the test data, Mahout tries to create a confusion matrix using the test labels that are missing from the label index and throws an exception.
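The failure mode described above can be illustrated with a small, self-contained sketch. This is not Mahout code: the class `LabelIndexSketch` and its methods are hypothetical names made up for illustration, and the exception type is an assumption. It only shows why a confusion matrix sized from the training label index cannot hold a label that occurs only in the test data, and how such labels could be detected beforehand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Mahout.
public class LabelIndexSketch {

    // Assign each distinct training label the next free row/column index.
    static Map<String, Integer> buildLabelIndex(List<String> trainingLabels) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String label : trainingLabels) {
            index.putIfAbsent(label, index.size());
        }
        return index;
    }

    // Return the test labels that the training label index does not know.
    static List<String> unseenLabels(Map<String, Integer> index, List<String> testLabels) {
        Set<String> unseen = new LinkedHashSet<>();
        for (String label : testLabels) {
            if (!index.containsKey(label)) {
                unseen.add(label);
            }
        }
        return new ArrayList<>(unseen);
    }

    // Build a matrix with rows = actual label, columns = predicted label.
    // Throws if an actual or predicted label has no slot in the index.
    static int[][] confusionMatrix(Map<String, Integer> index,
                                   List<String> actual, List<String> predicted) {
        int n = index.size();
        int[][] matrix = new int[n][n];
        for (int i = 0; i < actual.size(); i++) {
            Integer row = index.get(actual.get(i));
            Integer col = index.get(predicted.get(i));
            if (row == null || col == null) {
                throw new IllegalArgumentException(
                    "Label not in label index: " + actual.get(i) + " / " + predicted.get(i));
            }
            matrix[row][col]++;
        }
        return matrix;
    }

    public static void main(String[] args) {
        // The random split happened to leave label "c" out of the training data.
        Map<String, Integer> index = buildLabelIndex(Arrays.asList("a", "b", "a"));
        List<String> testActual = Arrays.asList("a", "c");
        List<String> testPredicted = Arrays.asList("a", "b");

        System.out.println("Unseen test labels: " + unseenLabels(index, testActual));
        try {
            confusionMatrix(index, testActual, testPredicted);
        } catch (IllegalArgumentException e) {
            System.out.println("Confusion matrix failed: " + e.getMessage());
        }
    }
}
```

In this sketch, checking `unseenLabels` before testing, or skipping the confusion matrix entirely, avoids the exception; which of the two fits Mahout better is left open here.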