Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 1.4.0
- None
- None
Description
By default, PDFBox uses temporary files as work space, regardless of the PDF size.
org.apache.pdfbox.io.RandomAccessFile is not buffered, so each read/write access is a system call. Some functions, such as readlong, call read 4 times to read 4 bytes. On top of that come the usual problems with temporary files.
For normally sized PDF files, the in-memory implementation RandomAccessBuffer should not increase memory usage much, while providing faster I/O because all access operations are only memory copies.
Therefore, please consider switching the default to in-memory scratch buffers. Users with very large files can still pass a temporary directory.
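The cost difference described above can be illustrated with a minimal sketch. It uses only java.io.RandomAccessFile from the JDK, not PDFBox's own classes. InMemoryScratch, readIntUnbuffered, readIntInMemory and the two round-trip helpers are hypothetical names introduced for illustration and are not part of the PDFBox API.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class ScratchBufferSketch {

    /** Hypothetical growable in-memory scratch buffer (not PDFBox's RandomAccessBuffer). */
    static final class InMemoryScratch {
        private byte[] data = new byte[16];
        private int position = 0;
        private int length = 0;

        void write(int b) {
            if (position == data.length) {
                data = Arrays.copyOf(data, data.length * 2); // grow on demand
            }
            data[position++] = (byte) b;
            length = Math.max(length, position);
        }

        void seek(int newPosition) {
            position = newPosition;
        }

        /** Returns the next byte as 0..255, or -1 at the end of the buffer. */
        int read() {
            return position < length ? (data[position++] & 0xFF) : -1;
        }
    }

    /** Four single-byte reads on an unbuffered file: each read() goes to the OS. */
    static int readIntUnbuffered(RandomAccessFile file) throws IOException {
        int result = 0;
        for (int i = 0; i < 4; i++) {
            int b = file.read();
            if (b < 0) {
                throw new EOFException();
            }
            result = (result << 8) | b;
        }
        return result;
    }

    /** The same four reads against memory: plain array accesses, no system calls. */
    static int readIntInMemory(InMemoryScratch scratch) {
        int result = 0;
        for (int i = 0; i < 4; i++) {
            int b = scratch.read();
            if (b < 0) {
                throw new IllegalStateException("unexpected end of buffer");
            }
            result = (result << 8) | b;
        }
        return result;
    }

    /** Writes a value to a temporary file and reads it back byte by byte. */
    static int roundTripViaTempFile(int value) {
        File temp = null;
        try {
            temp = File.createTempFile("scratch", ".tmp");
            try (RandomAccessFile file = new RandomAccessFile(temp, "rw")) {
                file.writeInt(value); // big-endian, 4 bytes
                file.seek(0);
                return readIntUnbuffered(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (temp != null) {
                temp.delete(); // temporary files need explicit cleanup
            }
        }
    }

    /** Writes a value to the in-memory buffer and reads it back byte by byte. */
    static int roundTripViaMemory(int value) {
        InMemoryScratch scratch = new InMemoryScratch();
        for (int shift = 24; shift >= 0; shift -= 8) {
            scratch.write((value >>> shift) & 0xFF);
        }
        scratch.seek(0);
        return readIntInMemory(scratch);
    }
}
```

Both round trips return the same value. The difference is that the file-based path needs a temporary file, its cleanup, and one OS-level read per byte, while the in-memory path only touches an array.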
Attachments
Issue Links
- depends upon PDFBOX-946 RandomAccessBuffer should be created empty (Closed)