Tika / TIKA-2187

Align default behavior of experimental docx parser with that of doc parser in handling delText

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.15, 2.0.0
    • Component/s: None
    • Labels: None

    Description

      Now that we can ignore delText via the experimental alternate SAXParser for .docx files, let's make that the default behavior to align with the expected behavior for our .doc parser (ignore deleted text).

      Let's also add the ability to include deleted text from .doc files.
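
      The behavior requested above can be sketched with a plain JAXP SAX handler. This is only an illustration, not Tika's parser code: the class name `DelTextSketch`, the method `extract`, and the flag `includeDeletedContent` are hypothetical names chosen for this example, and the input is a hand-written WordprocessingML fragment rather than a real .docx package. The handler captures text inside `w:t` and captures text inside `w:delText` only when the flag is set, so skipping deleted text is the default.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: not part of Tika.
class DelTextSketch {

    // WordprocessingML main namespace used by the w: prefix in document.xml.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the run text of the fragment. Text in w:delText is kept only
    // when includeDeletedContent is true (hypothetical flag name).
    static String extract(String xml, boolean includeDeletedContent) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        StringBuilder out = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            private boolean capture = false;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (!W_NS.equals(uri)) {
                    return;
                }
                if ("t".equals(localName)) {
                    capture = true;                    // ordinary run text
                } else if ("delText".equals(localName)) {
                    capture = includeDeletedContent;   // tracked deletion
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (W_NS.equals(uri)
                        && ("t".equals(localName) || "delText".equals(localName))) {
                    capture = false;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (capture) {
                    out.append(ch, start, length);
                }
            }
        };

        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<w:p xmlns:w=\"" + W_NS + "\">"
                + "<w:r><w:t xml:space=\"preserve\">Kept </w:t></w:r>"
                + "<w:del><w:r><w:delText>removed </w:delText></w:r></w:del>"
                + "<w:r><w:t>text</w:t></w:r>"
                + "</w:p>";
        System.out.println(extract(xml, false)); // deleted text skipped
        System.out.println(extract(xml, true));  // deleted text included
    }
}
```

      With the flag off, the deleted run is dropped, matching the behavior the .doc parser is described as having; with it on, the deleted run is emitted in document order.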

            People

              Assignee: Unassigned
              Reporter: Tim Allison (tallison)
              Votes: 0
              Watchers: 3
