PDFBox / PDFBOX-953

PDFBox fails to ExtractText from Adobe Acrobat X 256-bit AES encrypted documents

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.1, 1.4.0
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None
    • Environment:
      Java: jdk1.6.0_20
      OS: Windows 7, RHEL 5.5

      Description

      Running ExtractText from the PDFBox command-line tool on such a document prints this exception:

      ExtractText failed with the following exception:
      java.lang.ArrayIndexOutOfBoundsException
      at java.lang.System.arraycopy(Native Method)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeEncryptedKey(StandardSecurityHandler.java:571)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeUserPassword(StandardSecurityHandler.java:608)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.isUserPassword(StandardSecurityHandler.java:792)
      at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:189)
      at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1091)
      at org.apache.pdfbox.ExtractText.main(ExtractText.java:190)
      at org.apache.pdfbox.PDFBox.main(PDFBox.java:42)

      The document I was using was encrypted with Adobe Acrobat X Pro using only a permissions password. The only restriction set on it is that Page Extraction is disabled.
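
      The trace ends in System.arraycopy inside computeEncryptedKey. One plausible reading, which is an assumption and not taken from the PDFBox source, is that the older key derivation copies the key out of a 16-byte MD5 digest, while a 256-bit key needs 32 bytes. The sketch below reproduces only that failure pattern with the JDK alone; KeyCopySketch and deriveKey are hypothetical names, and this is not the real PDF key-derivation algorithm.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not PDFBox code.
class KeyCopySketch {

    // Hashes the password with MD5 and copies keyLengthBytes bytes
    // out of the digest, as an MD5-based scheme might do.
    static byte[] deriveKey(byte[] password, int keyLengthBytes)
            throws NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(password); // MD5 always yields 16 bytes
        byte[] key = new byte[keyLengthBytes];
        // Fails when keyLengthBytes exceeds the 16-byte digest.
        System.arraycopy(digest, 0, key, 0, keyLengthBytes);
        return key;
    }

    public static void main(String[] args) throws Exception {
        byte[] password = new byte[0];

        // 128-bit key: 16 bytes fit inside the digest.
        System.out.println("128-bit key length: " + deriveKey(password, 16).length);

        // 256-bit key: 32 bytes do not fit, so arraycopy throws.
        try {
            deriveKey(password, 32);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("256-bit key failed: " + e.getClass().getName());
        }
    }
}
```

      If this reading is right, handling such documents needs a separate key-derivation path for 256-bit AES instead of a larger copy length, which would fit the issue being typed as a New Feature.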

              People

              • Assignee:
                Andreas Lehmkühler (lehmi)
              • Reporter:
                Peter Nordquist (peter.nordquist@pnl.gov)
              • Votes:
                1
              • Watchers:
                4