It would be cool to be able to run Tika as a network service that accepts a binary document as input and produces the extracted content (as XHTML, text, or just metadata) as output. A bit like
TIKA-169, but without the dependency to a servlet container.
I'd like to be able to set up and run such a server like this:
$ java -jar tika-app.jar --port 1234
We should also add a NetworkParser class that acts as a local client for such a service. This way a lightweight client could use the full set of Tika parsing functionality even with just the tika-core jar within its classpath.