  1. PDFBox
  2. PDFBOX-5483

Replace methods using an InputStream from Loader.loadPDF

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0 PDFBox
    • Fix Version/s: 3.0.0 PDFBox
    • Component/s: Parsing
    • Labels: None

    Description

      As discussed on dev@pdfbox

      We have to remove the loadPDF variants using InputStream and replace them with RandomAccessRead.

      When it comes to InputStreams, users have to decide how to proceed:

      • copy the InputStream to memory by using RandomAccessReadBuffer
      • copy the InputStream to a file and use RandomAccessReadBufferedFile or RandomAccessReadMemoryMappedFile
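
      The two options above can be sketched with plain JDK calls. This is only an illustration, not PDFBox code: the class and method names (StreamMaterializer, toMemory, toTempFile) are hypothetical, and the comments mark where the resulting bytes or file would be wrapped in the RandomAccessRead implementations named above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of PDFBox: shows the two ways a caller
// can materialize an InputStream before creating a RandomAccessRead.
public class StreamMaterializer {

    // Option 1: copy the whole stream onto the heap (requires Java 9+).
    // The returned array could then be wrapped in a RandomAccessReadBuffer.
    public static byte[] toMemory(InputStream in) throws IOException {
        return in.readAllBytes();
    }

    // Option 2: spool the stream to a temporary file.
    // The returned path could then be opened with RandomAccessReadBufferedFile
    // or RandomAccessReadMemoryMappedFile. The caller must delete the file.
    public static Path toTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("pdf-input-", ".pdf");
        // createTempFile already created the file, so overwrite it.
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {0x25, 0x50, 0x44, 0x46}; // placeholder bytes, not a valid PDF

        byte[] inMemory = toMemory(new ByteArrayInputStream(sample));
        System.out.println("bytes held in memory: " + inMemory.length);

        Path onDisk = toTempFile(new ByteArrayInputStream(sample));
        System.out.println("bytes spooled to disk: " + Files.size(onDisk));
        Files.delete(onDisk);
    }
}
```

      The trade-off is heap usage against disk I/O and temp-file cleanup, which is the decision this issue hands back to the user.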

      This would make it more transparent what happens under the hood when using the different kinds of loadPDF methods:

      • a byte array as source is already in memory and the obvious choice is to use RandomAccessReadBuffer as a wrapper
      • a file as source targets a local file, and the most obvious choice is to use RandomAccessReadBufferedFile as a wrapper. We should document that RandomAccessReadMemoryMappedFile is available as an alternative in this case
      • RandomAccessRead as source is the most obvious one, and the user decides how to create it. Additionally, it is possible to implement a custom caching and/or loading mechanism
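
      A minimal sketch of how these source types might map onto Loader.loadPDF(RandomAccessRead), assuming the PDFBox 3.0 classes named in this issue are on the classpath. The wrapper class LoadVariants and its method names are hypothetical and exist only for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.io.RandomAccessReadBuffer;
import org.apache.pdfbox.io.RandomAccessReadBufferedFile;
import org.apache.pdfbox.io.RandomAccessReadMemoryMappedFile;
import org.apache.pdfbox.pdmodel.PDDocument;

// Hypothetical wrapper class: each method makes the choice of
// RandomAccessRead implementation explicit at the call site.
public class LoadVariants {

    // Byte array: already in memory, so wrap it in a RandomAccessReadBuffer.
    public static PDDocument fromBytes(byte[] pdfBytes) throws IOException {
        return Loader.loadPDF(new RandomAccessReadBuffer(pdfBytes));
    }

    // Local file, buffered reads.
    public static PDDocument fromFile(File pdfFile) throws IOException {
        return Loader.loadPDF(new RandomAccessReadBufferedFile(pdfFile));
    }

    // Local file, memory-mapped alternative.
    public static PDDocument fromMappedFile(File pdfFile) throws IOException {
        return Loader.loadPDF(new RandomAccessReadMemoryMappedFile(pdfFile));
    }

    // InputStream: the caller explicitly opts into copying it to memory.
    public static PDDocument fromStreamInMemory(InputStream in) throws IOException {
        return Loader.loadPDF(new RandomAccessReadBuffer(in));
    }
}
```

      Callers would typically use try-with-resources on the returned PDDocument so that the document is closed when processing finishes.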

      See also PDFBOX-5462 and "High memory usage with pdfbox 3".

          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Andreas Lehmkühler (lehmi)