Description
Hello.
I run tika-server standalone, started with the -spawnChild option:
java -jar /opt/tika/tika-server-1.19.1.jar -spawnChild
I use ManifoldCF and Solr to index files through tika-server.
Indexing keeps failing because I get many errors such as:
Tika down, retrying: Connection reset
etc.
I suspect that when the child process is restarted, the client fails, as described here:
If the child process is in the process of shutting down, and it gets a new request it will return 503 – Service Unavailable. If the server times out on a file, the client will receive an IOException from the closed socket. Note that all other files that are being processed will end with an IOException from a closed socket when the child process shuts down; e.g. if you send three files to tika-server concurrently, and one of them causes a catastrophic problem requiring the child to shut down, you won't be able to tell which file caused the problems. In the future, we may implement a gentler shutdown than we currently have.
Source: https://wiki.apache.org/tika/TikaJAXRS
How can I work around this?
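One client-side mitigation that the quoted behaviour suggests is to treat both a 503 response and an IOException from a closed socket as temporary, and to retry after a growing pause while the child process restarts. The following is only a minimal sketch under assumptions, not ManifoldCF's actual connector code: the class TikaRetryClient, the interface Attempt, the exception ServiceUnavailableException and the methods withRetry and extractText are all hypothetical names, and the attempt count and backoff values are arbitrary. Retrying does not help with a file that crashes the child every time; it may only help the other files that were in flight when the child shut down.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
public class TikaRetryClient {

    /** One attempt at a request; may fail with an IOException. */
    public interface Attempt<T> {
        T run() throws IOException;
    }

    /** Signals an HTTP 503 so the retry loop can handle it like a dropped socket. */
    public static class ServiceUnavailableException extends IOException {
        public ServiceUnavailableException(String message) {
            super(message);
        }
    }

    /**
     * Runs the attempt up to maxAttempts times. After each IOException
     * (which includes ServiceUnavailableException) it sleeps, then doubles
     * the wait. The last IOException is rethrown if every attempt fails.
     */
    public static <T> T withRetry(Attempt<T> attempt, int maxAttempts, long initialBackoffMs)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMs;
        IOException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run();
            } catch (IOException e) {
                last = e;
                if (i < maxAttempts) {
                    Thread.sleep(backoff);
                    backoff *= 2;
                }
            }
        }
        throw last;
    }

    /** Sends one document to tika-server's /tika endpoint and returns the extracted text. */
    public static String extractText(String serverUrl, byte[] document) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(serverUrl + "/tika").openConnection();
        try {
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            conn.setRequestProperty("Accept", "text/plain");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(document);
            }
            int status = conn.getResponseCode();
            if (status == 503) {
                // The child process is shutting down or restarting.
                throw new ServiceUnavailableException("tika-server returned 503");
            }
            if (status != 200) {
                // Not retried: other statuses are unlikely to be caused by a restart.
                throw new IllegalStateException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A call could then look like `TikaRetryClient.withRetry(() -> TikaRetryClient.extractText("http://localhost:9998", bytes), 5, 500)`, assuming tika-server listens on its default port 9998 and `bytes` holds the file content.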
Thanks a lot
Mario
Attachments
Issue Links
- is related to
  - CONNECTORS-1561 Upgrade to Tika 1.20 when available (Resolved)
  - CONNECTORS-1560 Improve tika-server robustness via -spawnChild (Resolved)