Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Tika-eval's language stats are tightly coupled to the tika-eval application and to its initial workflow: running against a directory of extracts and reporting results to an H2 database.
It would be helpful for large-scale data processing pipelines if some of tika-eval's stats were modularized so that they could be applied to other data sources, e.g., a full Solr/ES cluster. We won't build the actual connectors to Solr/ES/other systems in this ticket, but we will make it easier for integrators to build their own.
This is slated for 1.23/2.0, not 1.22.
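To illustrate the kind of decoupling described above, here is a minimal sketch in plain Java. It is not the tika-eval API: the names `TextStat`, `TokenCount`, `UniqueTokenCount`, and `CompositeStats` are hypothetical, and whitespace tokenization is a simplifying assumption made only to keep the example self-contained. The point is that each stat takes a string and returns a value, with no dependency on a directory of extracts or on H2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are part of tika-eval.
class TextStatsSketch {

    /** One statistic computed from a plain string, with no I/O or db access. */
    interface TextStat<T> {
        String name();
        T calculate(String text);
    }

    /** Assumption: simple whitespace tokenization stands in for a real analyzer. */
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    static final class TokenCount implements TextStat<Integer> {
        public String name() { return "tokenCount"; }
        public Integer calculate(String text) { return tokenize(text).length; }
    }

    static final class UniqueTokenCount implements TextStat<Integer> {
        public String name() { return "uniqueTokenCount"; }
        public Integer calculate(String text) {
            return new HashSet<>(Arrays.asList(tokenize(text))).size();
        }
    }

    /** Runs a configurable set of stats over one string and collects the results by name. */
    static final class CompositeStats {
        private final List<TextStat<?>> stats;

        CompositeStats(List<? extends TextStat<?>> stats) {
            this.stats = new ArrayList<>(stats);
        }

        Map<String, Object> calculate(String text) {
            Map<String, Object> results = new LinkedHashMap<>();
            for (TextStat<?> stat : stats) {
                results.put(stat.name(), stat.calculate(text));
            }
            return results;
        }
    }

    public static void main(String[] args) {
        CompositeStats composite = new CompositeStats(
                Arrays.asList(new TokenCount(), new UniqueTokenCount()));
        // The string could come from a file extract, a Solr/ES field, or anywhere else.
        System.out.println(composite.calculate("the cat and the hat"));
        // prints {tokenCount=5, uniqueTokenCount=4}
    }
}
```

With this shape, an integrator would only need to pull text out of their own store and pass it to `CompositeStats.calculate`; where the results are written is left entirely to the caller.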