Tika / TIKA-2240

MS Write File

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.15, 2.0.0
    • Component/s: None
    • Labels: None

    Description

      We currently identify MS Write files by the suffix ".wri" in one place in our mime defs, but a different definition uses the MS Write magic 0x31be0000 to identify the file as an MS Word (.doc) file.

      In govdocs1, a handful of MS Write files carry the suffix .doc. We're getting an Invalid Header exception for these files.

      I think it would be better to move that magic out of our .doc definition into the .wri definition and handle these files with the EmptyParser.

      Any objections?
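
To make the proposal concrete, here is a minimal sketch of the intended detection order in plain Java. It is not Tika's actual detector: the class `MsWriteMagic` and its methods are hypothetical, the check assumes the magic 0x31be0000 sits at offset 0, and the media type strings (`application/x-mswrite`, `application/msword`) are assumed names for the two definitions.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of Tika: shows content magic taking
// precedence over the file suffix when the two disagree.
final class MsWriteMagic {
    // Magic from the issue description: 0x31be0000.
    private static final byte[] MAGIC = {0x31, (byte) 0xBE, 0x00, 0x00};

    // True if the buffer begins with the MS Write magic.
    // Assumption: the magic is matched at offset 0.
    static boolean startsWithWriteMagic(byte[] header) {
        if (header == null || header.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOfRange(header, 0, MAGIC.length), MAGIC);
    }

    // Pick a media type: magic first, then fall back to the suffix.
    // The returned type names are assumptions for illustration.
    static String guessType(byte[] header, String fileName) {
        if (startsWithWriteMagic(header)) {
            return "application/x-mswrite";
        }
        String name = fileName == null ? "" : fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".wri")) {
            return "application/x-mswrite";
        }
        if (name.endsWith(".doc")) {
            return "application/msword";
        }
        return "application/octet-stream";
    }
}
```

With this ordering, a file named like a .doc but starting with the Write magic resolves to the .wri type, so it would never reach the Word parser that raises the Invalid Header exception.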

      Attachments

        1. 746255.doc
          3 kB
          Tim Allison

          People

            Assignee: Unassigned
            Reporter: Tim Allison (tallison)
            Votes: 0
            Watchers: 2
