In https://issues.apache.org/jira/browse/LUCENE-4345?focusedCommentId=13454729&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13454729 Simon Willnauer pointed out that the current Classifier API doesn't work well in multi threading environments:
The interface you defined has some problems with respect to Multi-Threading IMO. The interface itself suggests that this class is stateful and you have to call methods in a certain order and at the same you need to make sure that it is not published for read access before training is done. I think it would be wise to pass in all needed objects as constructor arguments and make the references final so it can be shared across threads and add an interface that represents the trained model computed offline? In this case it doesn't really matter but in the future it might make sense. We can also skip the model interface entirely and remove the training method until we have some impls that really need to be trained.
I missed that at that point but I think for 6.0 it would be wise to rearrange the API to address that properly.