[TIKA-1857] Enhance PDFParser to extract text from XFA forms - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.13
Component/s: parser
Labels: patch
Flags: Patch

Description

Extract text from PDF forms that use XFA (XML Forms Architecture). Background on XFA: https://en.wikipedia.org/wiki/XFA

Attachments

- 041617_filled_out.pdf, 16/Feb/16 19:39, 815 kB, Tim Allison
- doc8.pdf, 14/Feb/17 14:02, 109 kB, Kenneth Lui
- govdocs1_xfas.zip, 26/Feb/16 01:49, 8.26 MB, Tim Allison
- xfa_in_govdocs1.txt, 16/Feb/16 18:17, 3 kB, Tim Allison

Issue Links

relates to: TIKA-1607 Introduce new arbitrary object key/values data structure for persistence of Tika Metadata (Status: Open)

People

Assignee: Unassigned
Reporter: Pascal Essiembre
Votes: 0
Watchers: 7

Dates

Created: 16/Feb/16 03:55
Updated: 28/Feb/17 15:11
Resolved: 02/Mar/16 02:25