Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
If a file is passed to Tika wrapped as a TikaInputStream with an underlying file, the DefaultZipDetector tries to open it as a ZipFile. If the file is truncated, or the ZipFile open fails for any other reason, the DefaultZipDetector effectively gives up.
Given that there's still a file available, we should try to do a streaming detect by reopening the file as a regular InputStream.
Otherwise, some truncated OOXML files are detected differently depending on whether the user passes in a file or a stream.
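
A minimal sketch of the proposed fallback, using only JDK classes. This is not the actual DefaultZipDetector code: the class name ZipFallbackSketch, its method names, and the simplified "word/" check are hypothetical and only illustrate the idea. It tries ZipFile first and, if that fails, reopens the same file as a plain InputStream and reads whatever local entries are still intact.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical illustration; not Tika's implementation.
final class ZipFallbackSketch {

    static final String DOCX =
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    /** Lists entry names via ZipFile, falling back to streaming if the open fails. */
    static List<String> listEntryNames(Path path) throws IOException {
        try (ZipFile zip = new ZipFile(path.toFile())) {
            List<String> names = new ArrayList<>();
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
            return names;
        } catch (IOException e) {
            // ZipFile relies on the central directory at the end of the file,
            // which a truncated file has lost. The file itself is still
            // available, so reopen it and read the local headers in order.
            return streamEntryNames(path);
        }
    }

    /** Reads entry names sequentially, keeping those found before any truncation. */
    static List<String> streamEntryNames(Path path) throws IOException {
        List<String> names = new ArrayList<>();
        try (InputStream in = Files.newInputStream(path);
             ZipInputStream zin = new ZipInputStream(in)) {
            try {
                ZipEntry entry;
                while ((entry = zin.getNextEntry()) != null) {
                    names.add(entry.getName());
                }
            } catch (IOException e) {
                // Ran off the end of a truncated file; keep the names seen so far.
            }
        }
        return names;
    }

    /** Simplified detection based on entry names only (an assumption of this sketch). */
    static String detect(Path path) throws IOException {
        List<String> names = listEntryNames(path);
        if (names.isEmpty()) {
            return "application/octet-stream"; // no entries recovered
        }
        boolean ooxml = names.contains("[Content_Types].xml");
        boolean word = names.stream().anyMatch(n -> n.startsWith("word/"));
        if (ooxml && word) {
            return DOCX;
        }
        return ooxml ? "application/x-tika-ooxml" : "application/zip";
    }
}
```

Calling ZipFallbackSketch.detect(path) on a complete file and on a truncated copy of it exercises both branches. Because the streaming path sees the same leading entries that a stream-based caller would, the two inputs can reach the same answer.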