Tika / TIKA-2725

Make tika-server robust against OOMs/infinite loops/memory leaks

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.19, 2.0.0
    • Component/s: None
    • Labels: None

      Description

      Currently, tika-server is vulnerable to OOMs, infinite loops, and memory leaks. I see two ways of making it robust:

      1) Use the ForkParser.
      2) Have tika-server spawn a child process that actually runs the server, and put a watcher thread in the child that kills the child on OOM, on timeout, or after x files. The parent process can then restart the child if it dies.

      I somewhat prefer 2) so that we don't have to pass the input stream twice. I propose 2), making it optional in Tika 1.x and then the default in Tika 2.x. We could also add a status ping from parent to child in case the child gets caught up in a stop-the-world GC (h/t Boaz Leskes).
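
      To make option 2) concrete, here is a rough sketch of how the pieces could fit together. It is not the tika-server implementation: WatchdogSketch, ChildStatus, startWatcher, superviseForever, and the limits used are all hypothetical names and values chosen for illustration. The child records when a parse starts and finishes, a watcher thread polls that state and halts the JVM on OOM, timeout, or after a maximum number of files, and the parent restarts the child whenever it exits. The parent-to-child status ping is not covered.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of option 2); not tika-server's actual classes. */
class WatchdogSketch {

    /** State kept in the child; decides when the child should exit. */
    static final class ChildStatus {
        private final long taskTimeoutMillis;
        private final long maxFiles;
        private final AtomicLong filesProcessed = new AtomicLong();
        // Start time of the parse in progress, or -1 when idle.
        private final AtomicLong activeSince = new AtomicLong(-1L);
        private volatile boolean oomSeen = false;

        ChildStatus(long taskTimeoutMillis, long maxFiles) {
            this.taskTimeoutMillis = taskTimeoutMillis;
            this.maxFiles = maxFiles;
        }

        void parseStarted(long now) { activeSince.set(now); }

        void parseFinished() {
            activeSince.set(-1L);
            filesProcessed.incrementAndGet();
        }

        /** Call from a catch block that sees an OutOfMemoryError. */
        void oomHit() { oomSeen = true; }

        /** Returns the reason to shut down, or null to keep running. */
        String shutdownReason(long now) {
            if (oomSeen) return "oom";
            long since = activeSince.get();
            if (since >= 0 && now - since > taskTimeoutMillis) return "timeout";
            if (filesProcessed.get() >= maxFiles) return "max-files";
            return null;
        }
    }

    /** Child side: daemon thread that polls the status and kills the JVM. */
    static Thread startWatcher(ChildStatus status, long pollMillis) {
        Thread t = new Thread(() -> {
            while (true) {
                String reason = status.shutdownReason(System.currentTimeMillis());
                if (reason != null) {
                    System.err.println("shutting down child: " + reason);
                    // halt() skips shutdown hooks, which may hang after an OOM.
                    Runtime.getRuntime().halt(1);
                }
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "watchdog");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Parent side: restart the child whenever it exits. The command is an assumption. */
    static void superviseForever(List<String> childCommand)
            throws IOException, InterruptedException {
        while (true) {
            Process child = new ProcessBuilder(childCommand).inheritIO().start();
            int exit = child.waitFor();
            System.err.println("child exited with " + exit + "; restarting");
            Thread.sleep(1000L); // small pause to avoid a tight restart loop
        }
    }

    public static void main(String[] args) {
        // Exercise only the decision logic; no process is spawned here.
        ChildStatus status = new ChildStatus(60_000L, 1_000L);
        status.parseStarted(0L);
        System.out.println(status.shutdownReason(30_000L));
        System.out.println(status.shutdownReason(61_000L));
    }
}
```

      Keeping the shutdown decision in a plain method that takes the current time makes it easy to test without threads or real processes. Restarting after a fixed number of files is what bounds slow memory leaks, since no single request would otherwise trigger the OOM or timeout checks.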

      Other options/recommendations?


            People

            • Assignee: Tim Allison (tallison@apache.org)
            • Reporter: Tim Allison (tallison@apache.org)
