PDFBox / PDFBOX-948

Don't use temporary files by default for all PDF sizes

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels:
      None

      Description

      By default, PDFBox uses temporary files as work space, regardless of the PDF size.

      org.apache.pdfbox.io.RandomAccessFile is not buffered, so each read/write access is a system call. There are functions like readlong that call read 4 times to read 4 bytes. Additionally, it brings the usual problems with temporary files.
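
To illustrate the cost described above, here is a minimal, self-contained sketch. It does not use PDFBox; `ReadCallDemo`, `CountingStream`, `readIntBytewise` and `readIntBulk` are hypothetical names invented for this illustration. The counter stands in for the number of calls that would reach an unbuffered file, where each one would be a system call.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ReadCallDemo {
    /** Hypothetical wrapper that counts how many read calls reach the source. */
    static final class CountingStream extends InputStream {
        private final InputStream in;
        int calls = 0;

        CountingStream(InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            calls++;
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    /** Reads a big-endian int one byte at a time: four calls to the source. */
    static int readIntBytewise(InputStream in) throws IOException {
        int value = 0;
        for (int i = 0; i < 4; i++) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException();
            }
            value = (value << 8) | b;
        }
        return value;
    }

    /**
     * Reads the same int with a single bulk call. This sketch treats a short
     * read as end of file; robust code would loop until 4 bytes arrive.
     */
    static int readIntBulk(InputStream in) throws IOException {
        byte[] buf = new byte[4];
        if (in.read(buf, 0, 4) < 4) {
            throw new EOFException();
        }
        return ((buf[0] & 0xFF) << 24) | ((buf[1] & 0xFF) << 16)
                | ((buf[2] & 0xFF) << 8) | (buf[3] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0, 0, 1, 2};
        CountingStream bytewise = new CountingStream(new ByteArrayInputStream(data));
        CountingStream bulk = new CountingStream(new ByteArrayInputStream(data));
        System.out.println(readIntBytewise(bytewise) + " in " + bytewise.calls + " calls");
        System.out.println(readIntBulk(bulk) + " in " + bulk.calls + " calls");
    }
}
```

Against a memory-backed source the difference is negligible, but against an unbuffered file the bytewise variant pays the per-call overhead four times for a single value.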

      For normal-sized PDF files, the in-memory implementation RandomAccessBuffer should not increase memory usage too much, while providing faster I/O, as all access operations are only memory copies.

      Therefore, please consider switching the default to in-memory scratch buffers. Users with very large files can still pass a temporary directory.
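
The proposed default could look roughly like the following sketch. It is not the actual patch and does not use the PDFBox classes: the `Scratch` interface, `MemoryScratch`, `FileScratch` and `open` are hypothetical names for this illustration. The assumption is that callers get an in-memory buffer unless they explicitly pass a temporary directory.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

public class ScratchDemo {
    /** Hypothetical minimal scratch-space contract (not the PDFBox interface). */
    interface Scratch {
        void write(long pos, byte[] src) throws IOException;
        void read(long pos, byte[] dst) throws IOException;
        void close() throws IOException;
    }

    /** In-memory scratch: every access is an array copy, no system calls. */
    static final class MemoryScratch implements Scratch {
        private byte[] data = new byte[1024];
        private int size = 0;

        public void write(long pos, byte[] src) {
            int end = Math.toIntExact(pos + src.length);
            if (end > data.length) {
                // Grow geometrically so repeated appends stay cheap.
                data = Arrays.copyOf(data, Math.max(end, data.length * 2));
            }
            System.arraycopy(src, 0, data, (int) pos, src.length);
            size = Math.max(size, end);
        }

        public void read(long pos, byte[] dst) throws IOException {
            if (pos < 0 || pos + dst.length > size) {
                throw new EOFException();
            }
            System.arraycopy(data, (int) pos, dst, 0, dst.length);
        }

        public void close() {
            data = null;
        }
    }

    /** File-backed scratch for documents too large to hold in memory. */
    static final class FileScratch implements Scratch {
        private final File file;
        private final RandomAccessFile raf;

        FileScratch(File tempDir) throws IOException {
            file = File.createTempFile("scratch", ".tmp", tempDir);
            raf = new RandomAccessFile(file, "rw");
        }

        public void write(long pos, byte[] src) throws IOException {
            raf.seek(pos);
            raf.write(src);
        }

        public void read(long pos, byte[] dst) throws IOException {
            raf.seek(pos);
            raf.readFully(dst);
        }

        public void close() throws IOException {
            raf.close();
            file.delete();
        }
    }

    /** Default to memory; use a temporary file only when a directory is given. */
    static Scratch open(File tempDir) throws IOException {
        if (tempDir == null) {
            return new MemoryScratch();
        }
        return new FileScratch(tempDir);
    }
}
```

With this shape, `ScratchDemo.open(null)` gives the fast path for typical documents, and passing a directory keeps the file-backed behaviour available for very large ones.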

          Activity

          Andreas Lehmkühler added a comment -

          I added the patch in revision 1072678 as proposed.

          Thanks for the contribution!

            People

            • Assignee:
              Andreas Lehmkühler
              Reporter:
              Martin Koegler
            • Votes:
              0
              Watchers:
              0