Tika / TIKA-416

Out-of-process text extraction

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9
    • Component/s: parser
    • Labels: None

      Description

      There's currently no easy way to guard against JVM crashes or excessive memory or CPU use caused by parsing very large, broken, or intentionally malicious input documents. To better protect against such cases, and to make Tika's resource consumption easier to manage in general, it would be great to have a way to run Tika parsers in separate JVM processes. This could be handled either as a separate "Tika parser daemon" or as an explicitly managed pool of forked JVMs.
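
      To make the "forked JVM" idea concrete, here is a minimal sketch using only the JDK's ProcessBuilder. It is an illustration under stated assumptions, not the implementation that resolved this issue: the class name ForkedExtractionSketch, the worker main class passed to childCommand, and the convention that the worker writes extracted text to stdout are all hypothetical. The child gets a hard heap cap via -Xmx, and the parent kills it if it exceeds a wall-clock timeout, so a crash or runaway parse is confined to the child process.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: isolate a parse in a child JVM with a heap cap and a timeout.
final class ForkedExtractionSketch {

    // Builds the command line for a child JVM. "mainClass" is an assumed
    // worker class that parses "inputPath" and prints the text to stdout.
    static List<String> childCommand(String mainClass, int maxHeapMb, String inputPath) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeapMb + "m");                 // hard heap limit for the child
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));   // reuse the parent's classpath
        cmd.add(mainClass);
        cmd.add(inputPath);
        return cmd;
    }

    // Runs the child and returns its exit code, or -1 if it had to be
    // killed because it did not finish within the timeout.
    static int runWithTimeout(List<String> command, File output, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process child = new ProcessBuilder(command)
                .redirectOutput(output)                          // extracted text goes to a file
                .redirectError(ProcessBuilder.Redirect.INHERIT)  // child errors show up in the parent's stderr
                .start();
        if (!child.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            child.destroyForcibly();                             // runaway parse: kill the child only
            child.waitFor();
            return -1;
        }
        return child.exitValue();
    }
}
```

      A caller might run runWithTimeout(childCommand("ExtractWorker", 256, "input.pdf"), new File("out.txt"), 30) and treat a non-zero or -1 result as a failed parse. Starting a JVM per document is slow, which is presumably why the description suggests a daemon or a managed pool of forked JVMs rather than one fork per parse.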

            People

            • Assignee: Jukka Zitting (jukkaz)
            • Reporter: Jukka Zitting (jukkaz)
