Description
Now that the experimental alternate SAXParser for .docx files can ignore delText, let's make ignoring deleted text the default behavior for .docx. This aligns it with the expected behavior of our .doc parser, which ignores deleted text.
Let's also add the ability to include deleted text from .doc files.
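For illustration, here is a minimal sketch of what "ignore delText" means at the SAX level. It is not Tika's actual code: the class `DelTextSketch`, the method `extract`, and the `includeDeleted` flag are hypothetical names, and it uses only the JDK's SAX parser. It assumes the usual `w:` prefix for WordprocessingML, where `w:t` holds live text and `w:delText` holds text deleted under track changes.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DelTextSketch {

    // Hypothetical sample: one paragraph with a tracked deletion in the middle.
    static final String SAMPLE =
        "<w:p xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">"
        + "<w:r><w:t>Hello </w:t></w:r>"
        + "<w:del><w:r><w:delText>cruel </w:delText></w:r></w:del>"
        + "<w:r><w:t>world</w:t></w:r>"
        + "</w:p>";

    /** Collects run text; deleted text is kept only when includeDeleted is true. */
    static String extract(String xml, boolean includeDeleted) throws Exception {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inText = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Assumption: the document uses the conventional "w" prefix.
                if (qName.equals("w:t") || (includeDeleted && qName.equals("w:delText"))) {
                    inText = true;
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("w:t") || qName.equals("w:delText")) {
                    inText = false;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inText) {
                    out.append(ch, start, length);
                }
            }
        };
        // The default factory is not namespace-aware, so qName carries the raw prefixed name.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extract(SAMPLE, false)); // deleted text ignored (proposed default)
        System.out.println(extract(SAMPLE, true));  // deleted text included (opt-in)
    }
}
```

With the flag off, the sample yields "Hello world"; with it on, "Hello cruel world". A matching switch for .doc would give both parsers the same default and the same opt-in.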
Issue Links
- is related to: TIKA-207 MS word doc containing tracked changes produces incorrect text (Closed)