So why are the normalized and normalizedList functions good?
First of all, why normalized?
When I first tried to use the Lucene Classification, one of the bigger problems was that the scores that come back mean nothing. Basically the classifier returns the class and what looks like a random number. If you have 2 texts and you push them through the classifier, the scores don't help you figure out which result is more trustworthy.
The normalized values make that possible. If you want to tell the user how sure you are, the normalized values help you out.
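To make the idea concrete, here is a minimal sketch of one possible normalization: divide each raw score by the sum of all raw scores, so the values add up to 1 and can be compared between texts. The class `ScoreNormalizer` and its `normalize` method are hypothetical names for illustration, not part of Lucene, and the sketch assumes the raw scores are non-negative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Lucene class): rescales raw per-class scores
// so that they sum to 1. Assumes all raw scores are non-negative.
final class ScoreNormalizer {

    static Map<String, Double> normalize(Map<String, Double> rawScores) {
        double sum = 0.0;
        for (double score : rawScores.values()) {
            sum += score;
        }
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : rawScores.entrySet()) {
            // If every raw score is 0, fall back to a uniform distribution.
            double value = sum > 0.0
                    ? entry.getValue() / sum
                    : 1.0 / rawScores.size();
            normalized.put(entry.getKey(), value);
        }
        return normalized;
    }
}
```

With raw scores of 3.0 for one class and 1.0 for another, this gives 0.75 and 0.25, which is something you can actually show to a user.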
Second, why lists?
If you can tell the user how sure you are, it's not a big step to also want to tell them what the other options are: what the 3 or 5 most relevant classes are.
Most classification algorithms already have those numbers a priori.
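As a rough sketch of what such a list could look like: if a classifier already holds a normalized score for every class, returning the top N is just a sort and a cut. `TopClasses` and `topN` are hypothetical names, and the input map stands in for whatever per-class scores the algorithm keeps internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Lucene class): returns the n best-scoring
// classes, highest score first.
final class TopClasses {

    static List<Map.Entry<String, Double>> topN(Map<String, Double> normalizedScores, int n) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(normalizedScores.entrySet());
        // Sort by score, descending.
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        // Clamp n so a too-large or negative value does not throw.
        int limit = Math.max(0, Math.min(n, entries.size()));
        return new ArrayList<>(entries.subList(0, limit));
    }
}
```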
The problem with the normalization and the lists:
Sadly, not all classification algorithms have lists; they just drop classes. So this can't go straight into the API, because some classification methods never have a list or a score.
I have 2 API suggestions:
The first: the Classifier interface gets the normalized and normalizedList functions, and some of the implementations throw exceptions if somebody tries to use them.
Or the Classifier interface doesn't get them, but some classifiers can provide these functions.
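The two suggestions could look roughly like this. This is only a sketch to compare the shapes. All type names here (`Scored`, `ClassifierWithOptionalLists`, `BasicClassifier`, `ListingClassifier`) are hypothetical and are not the real Lucene interfaces. Using default methods for the first option is my assumption about one way to do it.

```java
import java.util.List;

// Hypothetical result type: a class label plus its score.
final class Scored {
    final String label;
    final double score;

    Scored(String label, double score) {
        this.label = label;
        this.score = score;
    }
}

// Suggestion 1: every classifier has the functions, but implementations
// that cannot support them throw an exception.
interface ClassifierWithOptionalLists {
    Scored assignClass(String text);

    default Scored normalized(String text) {
        throw new UnsupportedOperationException("normalized is not supported");
    }

    default List<Scored> normalizedList(String text) {
        throw new UnsupportedOperationException("normalizedList is not supported");
    }
}

// Suggestion 2: the base interface stays small, and only classifiers that
// really have scores and lists implement the extended interface.
interface BasicClassifier {
    Scored assignClass(String text);
}

interface ListingClassifier extends BasicClassifier {
    Scored normalized(String text);

    List<Scored> normalizedList(String text);
}
```

With the first shape the caller only finds out at runtime that a classifier has no list. With the second the caller can check `instanceof ListingClassifier` up front, at the cost of one more interface.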