TIKA-815

Tika parsers should handle failures more gracefully

    Details

    • Type: Test
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.0
    • Fix Version/s: None
    • Component/s: parser
    • Labels:
      None

      Description

      We encountered an OutOfMemoryError (OOM) while parsing a Word document. We will report that failure to POI.

      This raises a question about the general robustness of the parsers.

      We've written a small test tool that reproduces the aforementioned OOM, along with other potential issues that we will report to the individual parsers. It is the parsers' responsibility to handle those failures gracefully.

      Still, it is easy to write generic tools at the Tika level to run this kind of test.

      So we are also filing this issue here to start a discussion on what role Tika should have when it comes to validating its parsers.

      Code here: https://github.com/lacostej/tika-hardener
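
      To make the idea of a generic, Tika-level robustness check concrete, here is a minimal sketch. It is not taken from the linked tika-hardener repository. The DocumentParser interface, the ParserHardener class, and the toy format are hypothetical stand-ins invented for illustration (DocumentParser is not Tika's Parser interface), and the rule that only an IOException counts as a graceful failure is an assumption of this sketch. The harness feeds every truncated prefix of a document to a parser and records how each attempt ends.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a parser entry point; this is NOT Tika's Parser interface. */
interface DocumentParser {
    void parse(InputStream in) throws Exception;
}

class ParserHardener {

    enum Kind { OK, GRACEFUL, UNGRACEFUL }

    /** Result of parsing one truncated copy of the input. */
    static final class Outcome {
        final int prefixLength;
        final Kind kind;
        final Throwable failure; // null when the parse succeeded

        Outcome(int prefixLength, Kind kind, Throwable failure) {
            this.prefixLength = prefixLength;
            this.kind = kind;
            this.failure = failure;
        }
    }

    /**
     * Assumption of this sketch: a checked IOException means the parser rejected
     * the input cleanly. Any other Throwable (RuntimeException, OutOfMemoryError,
     * StackOverflowError, ...) counts as an ungraceful failure.
     */
    static Kind classify(Throwable failure) {
        if (failure == null) {
            return Kind.OK;
        }
        return (failure instanceof IOException) ? Kind.GRACEFUL : Kind.UNGRACEFUL;
    }

    /** Runs the parser on every prefix of the input, from empty to complete. */
    static List<Outcome> checkTruncations(DocumentParser parser, byte[] original) {
        List<Outcome> outcomes = new ArrayList<>();
        for (int len = 0; len <= original.length; len++) {
            Throwable failure = null;
            try (InputStream in = new ByteArrayInputStream(Arrays.copyOf(original, len))) {
                parser.parse(in);
            } catch (Throwable t) {
                // Deliberately broad: OutOfMemoryError is an Error, not an Exception.
                failure = t;
            }
            outcomes.add(new Outcome(len, classify(failure), failure));
        }
        return outcomes;
    }

    /** Toy format for the demo: the first byte declares how many payload bytes follow. */
    static void toyParse(InputStream in) throws IOException {
        int declared = in.read();
        if (declared < 0) {
            throw new IOException("empty document");
        }
        for (int i = 0; i < declared; i++) {
            if (in.read() < 0) {
                // Bug on purpose: a truncated payload surfaces as an unchecked exception.
                throw new IllegalStateException("payload ended at byte " + i);
            }
        }
    }

    public static void main(String[] args) {
        byte[] document = {3, 10, 20, 30};
        for (Outcome o : checkTruncations(ParserHardener::toyParse, document)) {
            System.out.println("prefix of " + o.prefixLength + " bytes: " + o.kind
                    + (o.failure == null ? "" : " (" + o.failure + ")"));
        }
    }
}
```

      For the four-byte toy input, the empty prefix is rejected cleanly, the three partial prefixes fail with an unchecked exception, and the complete input parses. Catching Throwable in the same JVM is enough for a sketch, but the JVM may be left in a poor state after a real OutOfMemoryError, so a sturdier harness would probably run each parse in a separate process.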

        Activity

        Jerome Lacoste created issue
        Jukka Zitting made changes

        Field        Original Value   New Value
        Status       Open [ 1 ]       Resolved [ 5 ]
        Resolution                    Duplicate [ 3 ]

          People

          • Assignee: Unassigned
          • Reporter: Jerome Lacoste
          • Votes: 0
          • Watchers: 2

