Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9
    • Component/s: parser
    • Labels: None

      Description

      There's currently no easy way to guard against JVM crashes, or against excessive memory or CPU use, caused by parsing very large, broken, or intentionally malicious input documents. To better protect against such cases, and to generally improve the manageability of Tika's resource consumption, it would be great if we had a way to run Tika parsers in separate JVM processes. This could be handled either as a separate "Tika parser daemon" or as an explicitly managed pool of forked JVMs.

        Activity

        chrismattmann Chris A. Mattmann added a comment -

        Awesome job Jukka!

        jukkaz Jukka Zitting added a comment - edited

        An initial version of this feature is now working and included in the latest trunk.

        To illustrate the improvement, here's what I'm seeing, for example, with one fairly large Excel document:

        $ java -Xmx32m -jar tika-app-0.9-SNAPSHOT.jar large.xls
        Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.poi.poifs.storage.RawDataBlock.<init>(RawDataBlock.java:69)
        at org.apache.poi.poifs.storage.RawDataBlockList.<init>(RawDataBlockList.java:55)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:157)
        at org.apache.tika.detect.POIFSContainerDetector.getTopLevelNames(POIFSContainerDetector.java:145)
        at org.apache.tika.detect.POIFSContainerDetector.detect(POIFSContainerDetector.java:96)
        at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:60)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:126)
        at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:94)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:273)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:80)

        The OutOfMemoryError is really troublesome in many container environments where hitting the memory limit affects all active threads, not just the one using Tika.

        With the new out-of-process parsing feature, it's possible to externalize this problem into a separate background process:

        $ java -Xmx32m -jar tika-app-0.9-SNAPSHOT.jar --fork large.xls
        Exception in thread "main" java.io.IOException: Lost connection to a forked server process
        at org.apache.tika.fork.ForkClient.waitForResponse(ForkClient.java:149)
        at org.apache.tika.fork.ForkClient.call(ForkClient.java:84)
        at org.apache.tika.fork.ForkParser.parse(ForkParser.java:78)
        at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:94)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:273)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:80)

        Such normal exceptions are much easier to recover from.
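        From client code, the same isolation is available through the new ForkParser class visible in the stack trace above. The following is only a sketch of how it might be wired up (the file name is illustrative, and constructor and method signatures may differ slightly in trunk); it cannot run without Tika on the classpath:

        ```java
        import java.io.FileInputStream;
        import java.io.InputStream;

        import org.apache.tika.fork.ForkParser;
        import org.apache.tika.metadata.Metadata;
        import org.apache.tika.parser.AutoDetectParser;
        import org.apache.tika.parser.ParseContext;
        import org.apache.tika.sax.BodyContentHandler;

        public class ForkParseExample {
            public static void main(String[] args) throws Exception {
                // The actual parsing runs in a forked JVM; if the child
                // crashes or runs out of memory, the parent only sees a
                // normal IOException from parse().
                ForkParser parser = new ForkParser(
                        ForkParseExample.class.getClassLoader(),
                        new AutoDetectParser());
                InputStream in = new FileInputStream("large.xls");
                try {
                    BodyContentHandler handler = new BodyContentHandler();
                    parser.parse(in, handler, new Metadata(), new ParseContext());
                    System.out.println(handler.toString());
                } finally {
                    in.close();
                    parser.close(); // shut down the forked server process(es)
                }
            }
        }
        ```

        The parent JVM stays small regardless of what the document does to the child's heap, which is the whole point of the feature.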

        jukkaz Jukka Zitting added a comment -

        See http://jukkaz.wordpress.com/2010/05/27/forking-a-jvm/ for a summary of my current approach on how to achieve this.
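        The core trick described there is launching a child JVM with ProcessBuilder, reusing the java binary of the running JVM, and talking to it over its stdio pipes. A minimal JDK-only sketch of the launch step (the -Xmx value is illustrative):

        ```java
        import java.io.File;

        public class ForkJvm {
            public static void main(String[] args) throws Exception {
                // Locate the java binary of the currently running JVM.
                String java = System.getProperty("java.home")
                        + File.separator + "bin" + File.separator + "java";

                // Start a child JVM with its own (small) heap; stderr is
                // merged into stdout so the parent reads one pipe.
                Process child = new ProcessBuilder(java, "-Xmx32m", "-version")
                        .redirectErrorStream(true)
                        .start();

                int exit = child.waitFor();
                System.out.println("child JVM exited with " + exit);
            }
        }
        ```

        A real fork-parser would replace "-version" with a classpath and a server main class, and keep the child's stdin/stdout open as the command channel; losing that pipe is what surfaces as the "Lost connection to a forked server process" IOException.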

        chrismattmann Chris A. Mattmann added a comment -

        +1, this sounds like a great idea!

        We did some work on this in OODT in terms of simple external met extractors and so forth. Maybe we could follow a similar approach here. Check out:

        http://svn.apache.org/repos/asf/incubator/oodt/cas-metadata/trunk/src/main/java/gov/nasa/jpl/oodt/cas/metadata/extractors/ExternMetExtractor.java

        and

        http://svn.apache.org/repos/asf/incubator/oodt/cas-metadata/trunk/src/main/resources/examples/extern-config.xml

        as some examples of how to deal with this (NOTE, in OODT-3, we are still in the process of converting over the licenses and there are no "official" incubator releases of OODT yet, but I just wanted to let you know about it as some pointers to ways to get this done). You rock and I can't wait for this feature!


          People

          • Assignee:
            jukkaz Jukka Zitting
          • Reporter:
            jukkaz Jukka Zitting
          • Votes: 0
          • Watchers: 0
