TIKA-2841: Improve robustness of parsers of zip-based files on truncated files

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0, 1.21
    • Component/s: None
    • Labels: None

      Description

      We've already done some work on this for docx and similar formats, but we can do more for EPUB and OpenOffice files and, frankly, for MS Office files as well. We should also improve the ContainerDetector so that it works more robustly with truncated zips.
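
      To illustrate the general idea, here is a minimal sketch of salvaging entries from a truncated zip by streaming through it rather than relying on the central directory, which sits at the end of the file and is the first thing lost to truncation. It uses only java.util.zip from the JDK. The class and method names (TruncatedZipSalvage, readableEntryNames) are hypothetical, and this is not the code that was committed for this issue.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Tika: lists the entries of a zip stream
// that can be read completely before the data runs out.
public class TruncatedZipSalvage {

    public static List<String> readableEntryNames(InputStream in) {
        List<String> names = new ArrayList<>();
        byte[] buf = new byte[8192];
        // ZipInputStream walks the local file headers in order, so it does
        // not need the central directory at the end of the file.
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                // Drain the entry; a truncated entry fails here with an
                // IOException (typically EOFException or ZipException).
                while (zis.read(buf) != -1) {
                    // discard the bytes, we only check readability
                }
                names.add(entry.getName());
            }
        } catch (IOException e) {
            // Expected on truncated input: keep what was read so far.
        }
        return names;
    }
}
```

      A parser or detector built along these lines could still see early entries, such as a mimetype or [Content_Types].xml entry, when the tail of the file is missing. Whether those entries actually come first depends on how the file was written, so this is a best-effort fallback rather than a guarantee.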

        Attachments

        1. truncated_10000.zip
          10 kB
          Tim Allison
        2. truncated_30000.zip
          29 kB
          Tim Allison

            People

            • Assignee: Tim Allison (tallison)
            • Reporter: Tim Allison (tallison)
            • Votes: 0
            • Watchers: 2
