[PDFBOX-1557] NonSequentialPDFParser incorrectly parsing document info - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.8.0
Fix Version/s: 1.8.1
Component/s: Parsing
Labels: None
Environment: Mac OS X 10.6.8, Eclipse Version: Juno Service Release 2 (Build id: 20130225-0426), Java SE 6 (1.6.0)

Description

When a document is parsed with the NonSequentialPDFParser, the PDDocumentInformation returned by getDocumentInformation() seems to contain only null entries. This does not happen with the standard PDFParser. I have a large batch of PDF files with assorted problems that occasionally cause the standard parser to fail, so I was experimenting with the non-sequential parser when I came across this issue.

I'll attempt to attach some test code and a test PDF file that reproduce the issue.
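
The comparison described above can be sketched without the attachments. This is only an illustration and is not the attached TestParsers.java: it assumes the document information from each parser has already been copied into a plain map (for example from the PDDocumentInformation getters after loading once with PDDocument.load and once with PDDocument.loadNonSeq), and it lists the fields the standard parser filled in but the non-sequential parser left null. The class name, the method name `missingIn`, and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InfoComparison {

    // Returns the keys that have a non-null value in `reference`
    // but are null or absent in `candidate`.
    static List<String> missingIn(Map<String, String> reference,
                                  Map<String, String> candidate) {
        List<String> missing = new ArrayList<String>();
        for (Map.Entry<String, String> entry : reference.entrySet()) {
            if (entry.getValue() != null && candidate.get(entry.getKey()) == null) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only; they are not taken from aa.pdf.
        Map<String, String> sequential = new LinkedHashMap<String, String>();
        sequential.put("Title", "Example title");
        sequential.put("Author", "Example author");
        sequential.put("Keywords", null);

        // What the report describes for the non-sequential parser: every entry null.
        Map<String, String> nonSequential = new LinkedHashMap<String, String>();
        nonSequential.put("Title", null);
        nonSequential.put("Author", null);
        nonSequential.put("Keywords", null);

        // prints [Title, Author]
        System.out.println(missingIn(sequential, nonSequential));
    }
}
```

An empty result would mean both parsers agree on every field the standard parser populated. A field that is null in both maps, such as Keywords here, is not reported.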

Attachments

aa.pdf, 03/Apr/13 21:42, 404 kB, Robert Bartlett-Schneider
JIRA-1557.patch, 06/Apr/13 18:34, 1 kB, Eric Leleu
TestParsers.java, 03/Apr/13 21:42, 1.0 kB, Robert Bartlett-Schneider

Issue Links

relates to: PDFBOX-1603 Regression in PDDocument.loadNonSeq ? (Closed)

People

Assignee: Andreas Lehmkühler
Reporter: Robert Bartlett-Schneider
Votes: 0
Watchers: 3

Dates

Created: 03/Apr/13 21:40
Updated: 20/May/13 10:14
Resolved: 07/Apr/13 11:17