Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.10
Fix Version/s: None
Description
Currently, we have our own ODF parser code, which is based on SAX parsing of the content and meta parts. It covers the common cases, but is by no means complete.
The ODF Toolkit project has recently joined the Apache Incubator, and is working towards its first release. Once there's an incubating version, we should re-write the parser to delegate most of the work to ODF Toolkit.
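For context, the SAX-based approach described above can be illustrated roughly as follows. This is a minimal sketch, not the existing Tika parser code: the class name `OdfSaxSketch` and the method `extractText` are hypothetical, and it only collects text from `text:p` and `text:h` elements in an already-unzipped content part. It uses only the JDK's built-in SAX API and makes no assumptions about the ODF Toolkit API.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not the actual Tika ODF parser.
class OdfSaxSketch {
    // Namespace URI of the ODF "text" vocabulary.
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    // Streams over the content part and returns the text found inside
    // paragraphs and headings, one line per element.
    static String extractText(InputStream contentXml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on namespace URI + local name
        StringBuilder out = new StringBuilder();
        factory.newSAXParser().parse(contentXml, new DefaultHandler() {
            private int depth = 0; // > 0 while inside a paragraph or heading

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (isBlock(uri, local)) {
                    depth++;
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (isBlock(uri, local)) {
                    depth--;
                    out.append('\n'); // end each paragraph or heading with a newline
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (depth > 0) {
                    out.append(ch, start, length);
                }
            }
        });
        return out.toString();
    }

    private static boolean isBlock(String uri, String local) {
        return TEXT_NS.equals(uri) && ("p".equals(local) || "h".equals(local));
    }
}
```

Anything not handled explicitly here (headers and footers in the styles part, annotations, tables, and so on) is simply ignored, which is the kind of gap that makes a hand-written handler incomplete and a dedicated toolkit attractive.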
Issue Links
- blocks TIKA-736 OpenOffice parser: master footer text isn't extracted (Resolved)