Description
Currently, Tika does not extract text from the cover pages and tables of contents of DOCX documents. Example documents are attached.
To process the documents, I used the standalone Tika-App utility, tika-app-1.5.jar. I tried both passing the files on the command line and selecting them from the utility's menu.
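For reference, the command-line run can be sketched roughly as follows. This is a minimal sketch, not the exact invocation used for the report: the file name example.docx is a hypothetical placeholder for one of the attached documents, and the jar is assumed to be in the current directory. The --text option asks Tika-App for plain-text output.

```shell
# Assumed locations: adjust both to match your setup.
TIKA_JAR="tika-app-1.5.jar"
DOC="example.docx"   # hypothetical name standing in for an attached document

if [ -f "$TIKA_JAR" ] && [ -f "$DOC" ]; then
  # Print the extracted plain text to stdout.
  java -jar "$TIKA_JAR" --text "$DOC"
else
  echo "Cannot run: $TIKA_JAR or $DOC not found in the current directory"
fi
```

With the affected documents, the printed text would be expected to lack the cover page and table of contents content.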
Attachments
Issue Links
- depends upon TIKA-1380 Upgrade to Apache POI 3.11 beta 1 (Resolved)