PDFBox / PDFBOX-953

PDFBox fails to ExtractText from Adobe Acrobat X 256-bit AES encrypted documents

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.1, 1.4.0
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels: None
    • Java: jdk1.6.0_20
      OS: Windows 7, RHEL 5.5

    Description

      Running ExtractText from the command-line version of PDFBox prints the following exception:

      ExtractText failed with the following exception:
      java.lang.ArrayIndexOutOfBoundsException
      at java.lang.System.arraycopy(Native Method)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeEncryptedKey(StandardSecurityHandler.java:571)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeUserPassword(StandardSecurityHandler.java:608)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.isUserPassword(StandardSecurityHandler.java:792)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:189)
      at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1091)
      at org.apache.pdfbox.ExtractText.main(ExtractText.java:190)
      at org.apache.pdfbox.PDFBox.main(PDFBox.java:42)

      The document I was using was encrypted with Adobe Acrobat X Pro using only a permissions password (no document-open password). The only restriction set in it is that Page Extraction is disabled.
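
The trace ends in System.arraycopy inside computeEncryptedKey, which suggests a copy that runs past the end of one of its arrays. The following is only a minimal sketch of one way that can happen, assuming (this is not confirmed anywhere in this report) that the key is taken from a 16-byte MD5 digest and copied using the key length declared by the document. The class KeyCopySketch and its methods deriveKey and copyFails are hypothetical names for illustration and are not PDFBox code.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class KeyCopySketch {

    // Hypothetical helper, NOT PDFBox code: builds a key of keyLengthBytes
    // by copying that many bytes out of an MD5 digest of the password.
    static byte[] deriveKey(byte[] password, int keyLengthBytes)
            throws NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(password); // an MD5 digest is always 16 bytes
        byte[] key = new byte[keyLengthBytes];
        // Assumption: the copy length comes from the declared key length.
        // If keyLengthBytes > 16, this reads past the end of digest.
        System.arraycopy(digest, 0, key, 0, keyLengthBytes);
        return key;
    }

    // Returns true if the copy in deriveKey goes out of bounds.
    static boolean copyFails(int keyLengthBytes) {
        try {
            deriveKey(new byte[0], keyLengthBytes);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass of this type.
            return true;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("128-bit key copy fails: " + copyFails(16));
        System.out.println("256-bit key copy fails: " + copyFails(32));
    }
}
```

Under that assumption a 16-byte (128-bit) key copies cleanly, while a 32-byte (256-bit) key overruns the digest. That would be consistent with the exception appearing only for 256-bit AES documents, but the actual cause would need to be checked against StandardSecurityHandler.java line 571.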

      Attachments

        1. lorem-ipsum-256AES.pdf
          32 kB
          Peter Nordquist


            People

              Assignee: Andreas Lehmkühler (lehmi)
              Reporter: Peter Nordquist (peter.nordquist@pnl.gov)
              Votes: 1
              Watchers: 4
