[PDFBOX-4297] Allow space-efficient analysis of large PDFs - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Parsing
Labels: None

Description

Assume you receive a large PDF (300+ MB) and need to know:

1) the file names of embedded files if any

2) whether it is encrypted (symmetric or asymmetric)

3) certification level (and whether it is signed)

This should not use more than 5 MB of additional memory.

P.S.: This seems to be an example of https://pdfbox.apache.org/ideas.html "Handle large PDF files".
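
To illustrate what a bounded-memory pass over such a file could look like, here is a minimal sketch that uses only the Java standard library, not PDFBox. It streams the file through a fixed-size buffer and reports which raw name tokens occur. The class name `PdfMarkerScan`, the method `scan`, and the marker list are assumptions made for this illustration. It is a heuristic only: it cannot see tokens stored inside compressed object streams, it may be fooled by the same bytes occurring in unrelated data, and it does not extract file names, distinguish symmetric from asymmetric encryption, or read the actual certification level.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

class PdfMarkerScan {
    // Assumed marker list: raw PDF name tokens that hint at embedded files,
    // encryption, certification (DocMDP) and signatures (ByteRange).
    static final String[] MARKERS = {"/EmbeddedFiles", "/Encrypt", "/DocMDP", "/ByteRange"};

    // Scans the stream with a buffer of roughly chunkSize bytes and returns the markers seen.
    static Set<String> scan(InputStream in, int chunkSize) {
        // Keep the tail of each window so a marker split across two reads is still found.
        int overlap = 0;
        for (String m : MARKERS) {
            overlap = Math.max(overlap, m.length() - 1);
        }
        byte[] buf = new byte[overlap + chunkSize];
        Set<String> found = new LinkedHashSet<>();
        int kept = 0;
        try {
            int n;
            while ((n = in.read(buf, kept, buf.length - kept)) > 0) {
                int len = kept + n;
                // ISO-8859-1 maps every byte to exactly one char, so binary data is preserved.
                String window = new String(buf, 0, len, StandardCharsets.ISO_8859_1);
                for (String m : MARKERS) {
                    if (window.contains(m)) {
                        found.add(m);
                    }
                }
                // Move the last `overlap` bytes to the front for the next read.
                kept = Math.min(overlap, len);
                System.arraycopy(buf, len - kept, buf, 0, kept);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

Calling `PdfMarkerScan.scan(...)` with a `java.io.FileInputStream` over the PDF and a chunk size of, say, 1 MB keeps the extra memory at a small multiple of the chunk size regardless of file size. Answering the three questions reliably would still require parsing the trailer, the document catalog and the relevant dictionaries on demand.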

Attachments

programWinter2015_20210103_091853-sig_LTV.pdf (35.10 MB), attached by Ralf Hauser on 03/Jan/21 08:55

Issue Links

is related to: PDFBOX-4569 Implement an ondemand Parser (Closed)

People

Assignee: Unassigned
Reporter: Ralf Hauser
Votes: 1
Watchers: 6

Dates

Created: 22/Aug/18 05:18
Updated: 03/Jan/21 13:01