[PDFBOX-1792] Different metadata with NonSequentialPDFParser - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Minor
Resolution: Duplicate
Affects Version/s: 1.8.8, 2.0.0
Fix Version/s: None
Component/s: Parsing, XmpBox
Labels: None

Description

The traditional parser can extract metadata from a test document from TIKA-738, but the NonSequentialPDFParser cannot extract metadata from that file. Conversely, another file from the Tika test suite has metadata that the NonSequentialPDFParser can extract but the traditional parser cannot.
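
One way to make a discrepancy like this concrete is to collect the metadata each parser returns into a key/value map and list the keys that appear on only one side. The following is a minimal sketch under that assumption: the class `MetadataDiff`, the method `missingFrom`, and the sample keys and values are hypothetical and are not taken from PDFBox, Tika, or the attached files. How the maps are filled from each parser is deliberately left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper: not part of PDFBox or Tika.
final class MetadataDiff {

    /** Returns the keys present in {@code left} but absent from {@code right}, in sorted order. */
    static Set<String> missingFrom(Map<String, String> left, Map<String, String> right) {
        Set<String> keys = new TreeSet<>(left.keySet());
        keys.removeAll(right.keySet());
        return keys;
    }

    public static void main(String[] args) {
        // Placeholder values; real ones would come from each parser's output.
        Map<String, String> traditional = new TreeMap<>();
        traditional.put("dc:title", "Sample");
        traditional.put("xmp:CreateDate", "2013-12-03");

        Map<String, String> nonSequential = new TreeMap<>();
        nonSequential.put("dc:title", "Sample");

        // Keys that only one of the two parsers produced.
        System.out.println("Only traditional: " + missingFrom(traditional, nonSequential));
        System.out.println("Only non-sequential: " + missingFrom(nonSequential, traditional));
    }
}
```

Running the comparison in both directions matters here, because the report describes one file where each parser is the one that loses metadata.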

Attachments

testPDF_acroForm2.pdf (478 kB), 10/Dec/13 04:07, Tim Allison
PDFBOX-1792.tar.gz (19 kB), 03/Dec/13 15:00, Tim Allison

Issue Links

duplicates

PDFBOX-1806 Metadata not completely extracted by traditional parser, but is extracted by NonSequentialParser

Closed

PDFBOX-5128 Support parsing non standardized XMP

Closed

is depended upon by

TIKA-1203 Some metadata not extracted from PDF files when NonSequentialPDFParser is used

Closed

People

Assignee: Andreas Lehmkühler

Reporter: Tim Allison

Votes: 0

Watchers: 6

Dates

Created: 03/Dec/13 14:59

Updated: 30/Aug/23 18:07

Resolved: 30/Aug/23 18:07