Tika / TIKA-2841

Improve robustness of parsers of zip-based files on truncated files

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0, 1.21
    • Component/s: None
    • Labels: None

    Description

      We've done some work on this for docx and similar formats, but we can do more for EPUB and OpenOffice and, frankly, for MS Office as well. We should also improve the ContainerDetector so that it handles truncated zips more robustly.
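
      For illustration only, here is a minimal sketch of why streaming helps with truncated zips. It is not the change made for this issue and does not use any Tika API. `java.util.zip.ZipFile` relies on the central directory at the end of the archive, which a truncated file has lost, whereas `java.util.zip.ZipInputStream` walks the local file headers from the start and can therefore recover the leading entries. The class `TruncatedZipSketch` and its helpers `listReadableEntries` and `buildZip` are hypothetical names introduced for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch, not Tika code: list whatever entries can still be
// reached in a zip whose tail (including the central directory) is missing.
class TruncatedZipSketch {

    // Streams local file headers in order and stops quietly at the first
    // read error, returning the names seen up to that point.
    static List<String> listReadableEntries(InputStream in) {
        List<String> names = new ArrayList<>();
        byte[] buf = new byte[4096];
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
                // Drain the entry body; on a truncated entry this throws.
                while (zis.read(buf) != -1) {
                    // nothing to do, the bytes are discarded
                }
            }
        } catch (IOException e) {
            // EOFException or ZipException: the stream ended early.
            // Keep the entries collected so far instead of failing.
        }
        return names;
    }

    // Builds a small in-memory zip, one entry per name, each holding
    // 1000 bytes of seeded pseudo-random (poorly compressible) data.
    static byte[] buildZip(String... entryNames) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        Random random = new Random(42);
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (String name : entryNames) {
                byte[] data = new byte[1000];
                random.nextBytes(data);
                zos.putNextEntry(new ZipEntry(name));
                zos.write(data);
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        byte[] full = buildZip("a.xml", "b.xml", "c.xml");
        // Simulate truncation by keeping only the first half of the bytes.
        byte[] cut = Arrays.copyOf(full, full.length / 2);
        System.out.println("complete:  " + listReadableEntries(new ByteArrayInputStream(full)));
        System.out.println("truncated: " + listReadableEntries(new ByteArrayInputStream(cut)));
    }
}
```

      The sketch swallows the exception on purpose so that a caller still gets the entries that precede the truncation point; a real parser or detector would probably also want to record that the file was incomplete.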

      Attachments

        1. truncated_10000.zip
          10 kB
          Tim Allison
        2. truncated_30000.zip
          29 kB
          Tim Allison

          People

            Assignee: Tim Allison (tallison)
            Reporter: Tim Allison (tallison)
            Votes: 0
            Watchers: 2
