[TIKA-737] Use (Incubating) ODFToolkit to improve ODF file format processing - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.10
Fix Version/s: None
Component/s: parser
Labels: None

Description

Currently, we have our own ODF parser code, which is based on SAX parsing of the content and meta parts. It covers all the common parts, but is by no means complete.
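For illustration only, here is a minimal sketch of the SAX-based approach described above. It uses only JDK classes and is not Tika's actual parser code; the class name `OdfTextSketch` and the method `extractText` are hypothetical, and the sketch assumes the text lives in the package's `content.xml` entry.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, not the Tika implementation.
public class OdfTextSketch {
    // ODF "text" namespace, used for paragraphs (text:p) and headings (text:h).
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    /** Finds content.xml inside an ODF package (a zip) and returns its character data. */
    public static String extractText(InputStream odf) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("content.xml".equals(entry.getName())) {
                    return parseContent(zip);
                }
            }
        }
        return ""; // no content.xml entry found
    }

    private static String parseContent(InputStream xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        StringBuilder out = new StringBuilder();
        factory.newSAXParser().parse(xml, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length); // collect all character data
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // End each paragraph or heading with a line break.
                if (TEXT_NS.equals(uri) && ("p".equals(localName) || "h".equals(localName))) {
                    out.append('\n');
                }
            }
        });
        return out.toString();
    }
}
```

A sketch like this only sees `content.xml`. Text that ODF stores elsewhere in the package, such as headers and footers defined on master pages in `styles.xml`, would need separate handling, which is the kind of gap a full toolkit is meant to close.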

The ODF Toolkit project has recently joined the Apache Incubator, and is working towards its first release. Once an incubating release is available, we should rewrite the parser to delegate most of the work to ODF Toolkit.

Issue Links

blocks: TIKA-736 OpenOffice parser: master footer text isn't extracted (Resolved)

People

Assignee: Unassigned

Reporter: Nick Burch

Votes: 2

Watchers: 0

Dates

Created: 01/Oct/11 14:55

Updated: 06/Jan/12 04:36