As described in
TIKA-2360, we should refactor the ObjectRecogniser interface. I propose creating:
1. TextRecogniser (per Thamme Gowda: INPUT: text; OUTPUT: a set of metadata key/value pairs)
2. ObjectRecogniser (also per Thamme Gowda, covering the ObjectRecogniser, VideoLabeller, OCR, and Caption use cases: INPUT: raw bytes; OUTPUT: a set of metadata key/value pairs)
We should of course reconcile this with Tika-DL and how it folds in.
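The proposed split could be sketched roughly as below. This is only an illustration of the shape of the two interfaces, not the actual Tika API: the method name `recognise`, the `Map`-based return type, and the `WordCountRecogniser` demo class are all assumptions for the sake of the example.

```java
import java.util.HashMap;
import java.util.Map;

public class RecogniserSketch {

    /** Takes text input; returns a set of metadata key/value pairs. */
    interface TextRecogniser {
        Map<String, String> recognise(String text);
    }

    /**
     * Takes raw bytes (image, video frame, scanned page, ...); returns a set
     * of metadata key/value pairs. Would cover the ObjectRecogniser,
     * VideoLabeller, OCR, and Caption use cases.
     */
    interface ObjectRecogniser {
        Map<String, String> recognise(byte[] data);
    }

    /** Trivial demo implementation, purely for illustration. */
    static class WordCountRecogniser implements TextRecogniser {
        public Map<String, String> recognise(String text) {
            Map<String, String> metadata = new HashMap<>();
            metadata.put("WORD_COUNT",
                    String.valueOf(text.trim().split("\\s+").length));
            return metadata;
        }
    }

    public static void main(String[] args) {
        TextRecogniser r = new WordCountRecogniser();
        System.out.println(r.recognise("hello tika world").get("WORD_COUNT"));
    }
}
```

The point of the split is that both interfaces produce the same output shape (metadata key/values), so downstream consumers stay uniform while the input type (text vs. raw bytes) selects the right family of recognisers.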