PDFBOX-948: Don't use temporary files by default for all PDF sizes

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels: None

    Description

      By default, PDFBox uses temporary files as work space, regardless of the PDF size.

      org.apache.pdfbox.io.RandomAccessFile is not buffered, so each read/write access is a system call. Some functions, such as readlong, call read 4 times to read 4 bytes. Temporary files also bring their usual problems.
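
To illustrate the cost described above, here is a minimal plain-JDK sketch. It is not PDFBox code: the class UnbufferedReadSketch and its methods are hypothetical names, and it uses java.io.RandomAccessFile directly. It decodes the same 32-bit value once with four single-byte read() calls and once with one bulk request.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical demo class; not part of PDFBox.
class UnbufferedReadSketch {

    // Reads a big-endian 32-bit value one byte at a time: four read() calls,
    // each of which goes to the operating system on an unbuffered file.
    static int readIntBytewise(RandomAccessFile file) throws IOException {
        int value = 0;
        for (int i = 0; i < 4; i++) {
            int b = file.read();
            if (b < 0) {
                throw new EOFException();
            }
            value = (value << 8) | b;
        }
        return value;
    }

    // Fetches the same four bytes with one bulk request (typically a single
    // underlying read), then decodes them in memory.
    static int readIntBulk(RandomAccessFile file) throws IOException {
        byte[] buf = new byte[4];
        file.readFully(buf);
        return ((buf[0] & 0xFF) << 24) | ((buf[1] & 0xFF) << 16)
                | ((buf[2] & 0xFF) << 8) | (buf[3] & 0xFF);
    }

    // Writes four known bytes to a temporary file and decodes them both ways.
    static int[] demo() {
        try {
            File tmp = File.createTempFile("sketch", ".bin");
            try {
                Files.write(tmp.toPath(), new byte[] {0x00, 0x00, 0x01, 0x02});
                try (RandomAccessFile raf = new RandomAccessFile(tmp, "r")) {
                    int bytewise = readIntBytewise(raf);
                    raf.seek(0);
                    int bulk = readIntBulk(raf);
                    return new int[] {bytewise, bulk};
                }
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0] + " " + result[1]);
    }
}
```

Both variants return the same value; the difference is only in how many requests reach the file. A buffered or in-memory backend removes that per-byte cost.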

      For normal-sized PDF files, the in-memory implementation RandomAccessBuffer should not increase memory usage much, while providing faster I/O, since all access operations are only memory copies.

      Therefore, please consider switching the default to in-memory scratch buffers. Users with very large files can still pass a temporary directory.
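
The proposed default can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch and not the PDFBox API: ScratchSpace, MemoryScratch, FileScratch and ScratchSpaces are hypothetical names. The factory picks an in-memory buffer unless the caller supplies a temporary directory.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical names throughout; this is not the PDFBox API.
interface ScratchSpace extends AutoCloseable {
    void write(long position, byte[] data);
    byte[] read(long position, int length);
    @Override void close();
}

// Scratch space held in a growable byte array: every access is a memory copy.
class MemoryScratch implements ScratchSpace {
    private byte[] buffer = new byte[1024];
    private int size = 0;

    public void write(long position, byte[] data) {
        int end = Math.toIntExact(position + data.length);
        if (end > buffer.length) {
            buffer = Arrays.copyOf(buffer, Math.max(end, buffer.length * 2));
        }
        System.arraycopy(data, 0, buffer, (int) position, data.length);
        size = Math.max(size, end);
    }

    public byte[] read(long position, int length) {
        if (position < 0 || position + length > size) {
            throw new IndexOutOfBoundsException();
        }
        return Arrays.copyOfRange(buffer, (int) position, (int) position + length);
    }

    public void close() {
        buffer = null;
    }
}

// Scratch space backed by a temporary file in a caller-supplied directory.
class FileScratch implements ScratchSpace {
    private final File file;
    private final RandomAccessFile raf;

    FileScratch(File directory) {
        try {
            file = File.createTempFile("scratch", ".tmp", directory);
            raf = new RandomAccessFile(file, "rw");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void write(long position, byte[] data) {
        try {
            raf.seek(position);
            raf.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public byte[] read(long position, int length) {
        try {
            byte[] out = new byte[length];
            raf.seek(position);
            raf.readFully(out);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void close() {
        try {
            raf.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            file.delete();
        }
    }
}

final class ScratchSpaces {
    // In-memory by default; a temporary file only when a directory is given.
    static ScratchSpace open(File tempDirectory) {
        return tempDirectory == null
                ? new MemoryScratch()
                : new FileScratch(tempDirectory);
    }
}
```

With this shape, ScratchSpaces.open(null) gives the in-memory default, and a caller handling very large files passes a directory to opt into the file-backed variant.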

      Attachments

        1. default-inmemory-workfile.patch
          1 kB
          Martin Koegler

            People

              Assignee: Andreas Lehmkühler (lehmi)
              Reporter: Martin Koegler (e9925248)
              Votes: 0
              Watchers: 0

              Dates

                Created:
                Updated:
                Resolved: