[TIKA-3844] Improve extraction of PDF subset info - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.5.0
Component/s: None
Labels: None

Description

We're already extracting the PDF/A part and conformance level. We should add extraction for the PDF/VT, PDF/UA, and PDF/X subsets as well.

We should also finally get rid of the bad hack from 1.x that appended the PDF/A conformance to the file type.
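
To illustrate the kind of extraction requested above, here is a minimal sketch in Python (not Tika's Java implementation) that reads subset identifiers from an XMP packet. The namespace URIs and property names are written as I understand them from the respective standards and should be verified before use. The output keys such as `pdfa:part` and the function name `extract_pdf_subset_info` are hypothetical and are not Tika's metadata keys.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical output key -> (namespace URI, property name).
# ASSUMPTION: URIs and property names follow the PDF/A, PDF/UA, PDF/X and
# PDF/VT identification schemas as commonly documented; verify before use.
SUBSET_PROPERTIES = {
    "pdfa:part": ("http://www.aiim.org/pdfa/ns/id/", "part"),
    "pdfa:conformance": ("http://www.aiim.org/pdfa/ns/id/", "conformance"),
    "pdfua:part": ("http://www.aiim.org/pdfua/ns/id/", "part"),
    "pdfx:version": ("http://www.npes.org/pdfx/ns/id/", "GTS_PDFXVersion"),
    "pdfvt:version": ("http://www.npes.org/pdfvt/ns/id/", "GTS_PDFVTVersion"),
}


def extract_pdf_subset_info(xmp_xml: str) -> dict:
    """Return the subset identifiers found in an XMP packet string."""
    root = ET.fromstring(xmp_xml)
    found = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        for key, (ns, name) in SUBSET_PROPERTIES.items():
            if key in found:
                continue
            qname = f"{{{ns}}}{name}"
            # XMP allows a simple property as an attribute or a child element.
            value = desc.get(qname)
            if value is None:
                child = desc.find(qname)
                if child is not None and child.text:
                    value = child.text
            if value is not None:
                found[key] = value.strip()
    return found


# Made-up sample packet for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
        xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
      <pdfaid:part>2</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
      <pdfuaid:part>1</pdfuaid:part>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(extract_pdf_subset_info(SAMPLE_XMP))
```

Keeping each identifier under its own key, rather than folding it into the media type string, is in the spirit of dropping the 1.x hack: the file type stays plain and the subset details remain separately queryable.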

I'd like to thank Peter Wyatt via offline chat for everything that was right about this improvement. The other stuff is all mine.

People

Assignee: Unassigned

Reporter: Tim Allison

Votes: 0

Watchers: 2

Dates

Created: 31/Aug/22 16:22

Updated: 14/Sep/22 18:26

Resolved: 31/Aug/22 17:29