[PDFBOX-4215] Get pages from a HTTP stream of a large pdf file - ASF JIRA

Details

Type: Wish
Status: Closed
Priority: Minor
Resolution: Won't Do
Affects Version/s: 2.0.9
Fix Version/s: None
Component/s: Parsing
Labels: None

Description

Hi Apache contributors,

Suppose I have a very large PDF file and I want to split it into chunks (e.g. one file per page). I cannot load the entire file into memory, and I cannot use the computer's hard disk as the documentation suggests for large files. I do still have the file as a stream, which I can read line by line.

I read that it is not feasible to get the pages of a PDF in order (because of the PDF specification), but is it feasible to load arbitrary pages with PDFBox by reading the stream line by line and looking for page breaks?

Have a good day, A.
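For context on why a forward-only, line-by-line read is awkward: in a classic (non-linearized) PDF, the byte offset of the cross-reference table is recorded at the very end of the file, after the `startxref` keyword, and page content is usually stored in compressed binary streams rather than in text lines with visible page breaks. The following is a minimal sketch, not PDFBox code, that reads that offset from the last bytes of a file. The class name `StartXrefSketch` and the method `findStartXref` are hypothetical names chosen for illustration, and the sketch assumes the caller already holds the tail of the file in a byte array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; it is not part of PDFBox.
final class StartXrefSketch {

    private static final String KEYWORD = "startxref";

    /**
     * Returns the number that follows the last "startxref" keyword in the
     * given bytes, or -1 if the keyword or the number is missing.
     * Assumption: 'tail' holds the final bytes of a PDF file.
     */
    static long findStartXref(byte[] tail) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data is safe to scan.
        String text = new String(tail, StandardCharsets.ISO_8859_1);
        int idx = text.lastIndexOf(KEYWORD);
        if (idx < 0) {
            return -1;
        }
        int i = idx + KEYWORD.length();
        // Skip the whitespace (typically an end-of-line) after the keyword.
        while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        // Collect the ASCII digits of the offset.
        int start = i;
        while (i < text.length() && text.charAt(i) >= '0' && text.charAt(i) <= '9') {
            i++;
        }
        if (i == start) {
            return -1;
        }
        return Long.parseLong(text.substring(start, i));
    }
}
```

Because this offset only becomes known once the end of the stream has been reached, a reader that cannot seek or buffer has no reliable index of where the page objects are while the earlier bytes go by.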

People

Assignee: Unassigned

Reporter: Alexandre

Votes: 0

Watchers: 2

Dates

Created: 09/May/18 16:18

Updated: 09/May/18 19:06

Resolved: 09/May/18 17:58