Tika / TIKA-2330

Prevent preventable OOM in CompressorInputStream

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.15
    • Component/s: None
    • Labels: None

      Description

      On TIKA-1631, users noted that merely detecting an x-compress file could cause an OOM because we were instantiating the stream as part of detection.
      On COMPRESS-382, Luis Filipe Nassif noted that something similar happens with LZMA.

      Let's work with the Compress project to:
      1) add a static detect method that doesn't instantiate the streams (COMPRESS-385)
      2) allow a configurable limit on the amount of memory allocated for x-compress (COMPRESS-386) and LZMA (COMPRESS-382)

      Until we have a chance to make these changes in the compress project, let's temporarily copy/paste/update from Compress to fix these within Tika.
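
To illustrate the two ideas above, here is a minimal sketch of header checks that run before any decompressor is constructed. It is not the code committed to Tika or Commons Compress: the class and method names are hypothetical, and the figure of roughly 6 bytes of table state per LZW entry is an assumption made for the estimate. The parts that come from the file formats are the .Z magic bytes (0x1f 0x9d), the maximum code size in the low five bits of the third byte, and the little-endian dictionary size in bytes 1..4 of an .lzma header.

```java
/**
 * Illustrative only: inspect a few header bytes without building a decompressor.
 * Names are hypothetical, not Tika or Commons Compress API.
 */
final class HeaderMemoryCheck {

    /** True if the buffer starts with the .Z (x-compress) magic bytes 0x1f 0x9d. */
    static boolean looksLikeZ(byte[] header, int length) {
        return length >= 3
                && (header[0] & 0xff) == 0x1f
                && (header[1] & 0xff) == 0x9d;
    }

    /**
     * Estimated LZW table size in KB for a .Z stream. The low five bits of the
     * third byte give the maximum code size; the table has 2^maxCodeSize entries.
     * ASSUMPTION: about 6 bytes of table state per entry.
     */
    static long estimateZTableKb(byte[] header) {
        int maxCodeSize = header[2] & 0x1f;
        return ((1L << maxCodeSize) * 6) >> 10;
    }

    /**
     * Dictionary size in bytes declared by an .lzma header:
     * bytes 1..4, little-endian, read as an unsigned 32-bit value.
     */
    static long lzmaDictSizeBytes(byte[] header) {
        long size = 0;
        for (int i = 0; i < 4; i++) {
            size |= (header[1 + i] & 0xffL) << (8 * i);
        }
        return size;
    }

    /** True if the estimate exceeds the limit; a negative limit disables the check. */
    static boolean exceedsLimit(long neededKb, long limitKb) {
        return limitKb >= 0 && neededKb > limitKb;
    }
}
```

Under these assumptions, a .Z header whose third byte declares a maximum code size of 31 yields an estimate of 12,582,912 KB (about 12 GiB), so a caller could reject the file from its first three bytes instead of attempting the allocation. For LZMA, the declared dictionary size is only a rough lower bound on what a decoder would allocate.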

          Activity

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Tika-trunk #1242 (See https://builds.apache.org/job/Tika-trunk/1242/)
          TIKA-2330 – prevent preventable ooms in both detecting and parsing (tallison: https://github.com/apache/tika/commit/75eea6e5502f4f5a2edf5ab459b4c369d33f66e5)

          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/pkg/ZipContainerDetector.java
          • (edit) tika-parsers/src/test/java/org/apache/tika/parser/pkg/CompressParserTest.java
          • (add) tika-core/src/main/java/org/apache/tika/exception/TikaMemoryLimitException.java
          • (edit) tika-parsers/src/test/java/org/apache/tika/detect/TestContainerAwareDetector.java
          • (edit) tika-parent/pom.xml
          • (add) tika-parsers/src/test/resources/test-documents/testLZMA_oom
          • (edit) tika-parsers/pom.xml
          • (edit) tika-parsers/src/main/java/org/apache/tika/parser/pkg/CompressorParser.java
          • (add) tika-parsers/src/test/resources/test-documents/testZ_oom.Z
          • (add) tika-parsers/src/main/java/org/apache/tika/parser/pkg/TikaCompressorStreamFactory.java
          tallison@mitre.org Tim Allison added a comment -

          Didn't touch 2.0 because we should have an updated Commons Compress by then.


            People

            • Assignee: Unassigned
            • Reporter: Tim Allison (tallison@mitre.org)
