  Tika / TIKA-2187

Align default behavior of experimental docx parser with that of doc parser in handling delText

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0, 1.15
    • Component/s: None
    • Labels: None

      Description

      Now that we can ignore delText via the experimental alternate SAXParser for .docx files, let's make that the default behavior to align with the expected behavior for our .doc parser (ignore deleted text).

      Let's also add the ability to include deleted text from .doc files.
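
The behavior requested above can be illustrated with a minimal sketch using only the JDK's SAX API. This is not Tika's parser code; the class `DelTextSketch`, the handler `TextHandler`, and the `includeDeleted` flag are hypothetical names chosen for illustration. It assumes deleted runs appear as `w:delText` elements in the WordprocessingML namespace and shows text extraction that skips them by default, with an opt-in to include them.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DelTextSketch {

    // WordprocessingML main namespace used by .docx document parts.
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical handler: collects w:t text, and w:delText only on request.
    static class TextHandler extends DefaultHandler {
        private final boolean includeDeleted;
        private final StringBuilder out = new StringBuilder();
        private boolean inText = false;

        TextHandler(boolean includeDeleted) {
            this.includeDeleted = includeDeleted;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            if (!W_NS.equals(uri)) {
                return;
            }
            // Regular text is always captured; deleted text only if enabled.
            if ("t".equals(localName)
                    || (includeDeleted && "delText".equals(localName))) {
                inText = true;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (W_NS.equals(uri)
                    && ("t".equals(localName) || "delText".equals(localName))) {
                inText = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inText) {
                out.append(ch, start, length);
            }
        }

        String text() {
            return out.toString();
        }
    }

    // Parses an XML fragment and returns the extracted text.
    static String extract(String xml, boolean includeDeleted) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so localName is populated
        SAXParser parser = factory.newSAXParser();
        TextHandler handler = new TextHandler(includeDeleted);
        parser.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            handler);
        return handler.text();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<w:p xmlns:w=\"" + W_NS + "\">"
            + "<w:r><w:t xml:space=\"preserve\">kept </w:t></w:r>"
            + "<w:del w:id=\"1\" w:author=\"a\">"
            + "<w:r><w:delText>removed </w:delText></w:r></w:del>"
            + "<w:r><w:t>text</w:t></w:r>"
            + "</w:p>";
        System.out.println(extract(xml, false)); // deleted text ignored
        System.out.println(extract(xml, true));  // deleted text included
    }
}
```

Defaulting the flag to "ignore" mirrors the proposal in the description: deleted text is dropped unless the caller explicitly asks for it.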

              People

              • Assignee: Unassigned
              • Reporter: Tim Allison (tallison@apache.org)