Details
- Improvement
- Status: Resolved
- Major
- Resolution: Won't Fix
- 1.0
- None
- None
Description
The detection of ZIP bombs is nice, and the original issue says it is configurable, but I found no way to change the ParseContext of the AutoDetectParser, e.g. to allow deeper nesting levels. The SecureContentHandler instantiation is hardcoded and there is no point of intervention.
In my case, a simple ZIP of an Eclipse project (http://store.pangaea.de/Publications/AltaweelM_2011/Salinization.zip) triggered the bomb detection, but it is of course no bomb. The JAR/WAR files in this project themselves contain other JAR files and class files, which exceeds the nesting level of 10. Maybe even the Tika OSGi bundle would trigger the bomb detection (not tested).
I would like to raise the nesting level, but there is no way to do so. My workaround was to simply filter out JAR files by using a custom DocumentSelector in my ParseContext. JAR files contain no metadata we are interested in for our own development; we had already removed e.g. the CLASS file parsers from our Tika config, so we have a very simple parser structure that only allows PDF, office documents, TXT files, and so on.
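For anyone wanting to try the same workaround, here is a minimal sketch of the filtering decision only, written in plain Java so it runs without Tika on the classpath. The class name `JarFilter`, the method `shouldParse`, and the list of skipped extensions are hypothetical and not part of Tika. In a real setup I would expect this check to sit inside a `DocumentSelector` implementation's `select(Metadata)` method, fed with the resource name from the embedded document's metadata, and to be registered via `ParseContext.set(DocumentSelector.class, ...)`; please verify that wiring against the Tika version you use.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Tika: decides whether an embedded
// resource should be handed to the parser, based only on its file name.
final class JarFilter {
    // Extensions to skip. ".jar" matches the workaround described above;
    // ".war" is an assumption added for illustration.
    private static final List<String> SKIPPED_EXTENSIONS = List.of(".jar", ".war");

    private JarFilter() {}

    // Returns false for resources that should be skipped, true otherwise.
    static boolean shouldParse(String resourceName) {
        if (resourceName == null) {
            return true; // no name available: do not filter
        }
        String lower = resourceName.toLowerCase(Locale.ROOT);
        for (String ext : SKIPPED_EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return false; // nested archive we are not interested in
            }
        }
        return true;
    }
}
```

Filtering by file name is crude compared with raising the nesting limit, since a renamed JAR would still be parsed, but it avoids descending into the nested archives that caused the false bomb detection here.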