  1. PDFBox
  2. PDFBOX-4297

Allow space-efficient analysis of large PDFs

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Parsing
    • Labels: None

    Description

      Assume you receive a large PDF (300+ MB) and need to know:

      1) the file names of any embedded files

      2) whether it is encrypted, and if so whether symmetrically or asymmetrically

      3) its certification level (and whether it is signed)

      Answering these questions should not require more than 5 MB of additional memory.

       

      P.S.: This seems to be an example of the "Handle large PDF files" idea listed at https://pdfbox.apache.org/ideas.html
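
      To illustrate the kind of bounded-memory pass the request has in mind, here is a minimal sketch that is not based on PDFBox at all. It streams the bytes through a small fixed-size window and reports whether the name tokens /EmbeddedFiles, /Encrypt, /DocMDP and /ByteRange occur anywhere. The class name PdfMarkerScan, the method scan and the chunkSize parameter are hypothetical names chosen for this example. It is only a heuristic pre-check: tokens inside compressed object streams are missed, binary stream data can match by coincidence, and it cannot list file names, distinguish symmetric from asymmetric encryption, or read the certification level. Those answers still require real parsing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: a heuristic marker scan in constant memory.
final class PdfMarkerScan {
    // PDF name tokens that hint at embedded files, encryption, certification and signatures.
    static final String[] MARKERS = {"/EmbeddedFiles", "/Encrypt", "/DocMDP", "/ByteRange"};

    /** Reads the stream in chunks of chunkSize bytes and reports which markers were seen. */
    static Map<String, Boolean> scan(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[][] needles = new byte[MARKERS.length][];
        int longest = 0;
        for (int i = 0; i < MARKERS.length; i++) {
            needles[i] = MARKERS[i].getBytes(StandardCharsets.US_ASCII);
            longest = Math.max(longest, needles[i].length);
        }
        // Keep the tail of the previous chunk so a marker split across two reads is still found.
        int overlap = longest - 1;
        byte[] window = new byte[overlap + chunkSize];
        boolean[] found = new boolean[MARKERS.length];
        int carried = 0;
        try {
            int read;
            while ((read = in.read(window, carried, chunkSize)) != -1) {
                int len = carried + read;
                for (int i = 0; i < needles.length; i++) {
                    if (!found[i] && contains(window, len, needles[i])) {
                        found[i] = true;
                    }
                }
                carried = Math.min(overlap, len);
                System.arraycopy(window, len - carried, window, 0, carried);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (int i = 0; i < MARKERS.length; i++) {
            result.put(MARKERS[i], found[i]);
        }
        return result;
    }

    // Naive byte search over the first len bytes of hay.
    private static boolean contains(byte[] hay, int len, byte[] needle) {
        outer:
        for (int start = 0; start + needle.length <= len; start++) {
            for (int j = 0; j < needle.length; j++) {
                if (hay[start + j] != needle[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }
}
```

      For a file on disk, the stream could come from java.nio.file.Files.newInputStream(path). With a chunk size of, say, 64 KB, the working set stays far below the 5 MB budget regardless of file size, which is the property the request asks a real parser to offer.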

       

            People

              Assignee: Unassigned
              Reporter: Ralf Hauser (hauser@acm.org)
              Votes: 1
              Watchers: 6

              Dates

                Created:
                Updated: