  1. PDFBox
  2. PDFBOX-5175

Behaviour change in 2.0.20 due to use of IOUtils.populateBuffer in SecurityHandler.prepareAESInitializationVector leading to IOException for certain PDF

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.20, 2.0.21, 2.0.23
    • Fix Version/s: 2.0.24, 3.0.0 PDFBox
    • Component/s: Parsing
    • Labels:
      None

      Description

      We have a PDF file, which we cannot share, that carries a certification signature from Adobe.

      Prior to version 2.0.20, this PDF could be loaded and analyzed. From version 2.0.20 onward, calling the load method of PDDocument results in an IOException.

      We tracked down why this did not happen in 2.0.19 and found that the use of the populateBuffer method changes the behaviour of the prepareAESInitializationVector method.

      Before 2.0.20, the code looked like this:

              if (decrypt)
              {
                  // read IV from stream
                  int ivSize = data.read(iv);
                  if (ivSize == -1)
                  {
                      return false;
                  }
                  if (ivSize != iv.length)
                  {
                      throw new IOException(
                              "AES initialization vector not fully read: only "
                                      + ivSize + " bytes read instead of " + iv.length);
                  }
      
      

      If data was empty, the read call returned -1, the method returned false, and everything went on okay. 2.0.20 changed the read line to:

                  int ivSize = (int) IOUtils.populateBuffer(data, iv);
                  if (ivSize == -1) { 
                      return false; 
                  }
      

      Because the if condition is still there, we are not quite sure whether this was intentional.

      populateBuffer never returns -1; it always returns a value >= 0, so the ivSize == -1 branch can no longer be reached.
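      The difference in return values can be reproduced without PDFBox. The following is a minimal sketch: fillBuffer is a hypothetical stand-in written for this illustration, assuming only the behaviour described above (read until the buffer is full or the stream ends, then return the number of bytes read). It is not the library's source.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class IvReadDemo
{
    // Hypothetical stand-in for IOUtils.populateBuffer (assumption, not the PDFBox code):
    // reads until the buffer is full or the stream ends and returns the byte count (>= 0).
    public static long fillBuffer(InputStream in, byte[] buffer) throws IOException
    {
        int total = 0;
        while (total < buffer.length)
        {
            int n = in.read(buffer, total, buffer.length - total);
            if (n < 0)
            {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException
    {
        byte[] iv = new byte[16];

        // Plain InputStream.read on an empty stream signals end of stream with -1.
        int plainRead = new ByteArrayInputStream(new byte[0]).read(iv);

        // The fill-style helper reports "nothing read" as 0 instead.
        long filled = fillBuffer(new ByteArrayInputStream(new byte[0]), iv);

        System.out.println("InputStream.read: " + plainRead);
        System.out.println("fillBuffer:       " + filled);
    }
}
```

      With an empty stream, the first call yields -1 and the second yields 0, which is why a check against -1 no longer catches the empty case.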

      So either this is unintentional, in which case it is a bug and the if clause should check for 0 bytes read, or it is intentional, in which case the if clause is obsolete, as is the boolean return value.
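      For illustration, the first option (checking for 0 bytes read) might look roughly like the sketch below. This is not the PDFBox source and not necessarily the fix that was applied: readIv is a hypothetical, simplified method, and the JDK's InputStream.readNBytes (Java 9+) is used here as an assumed substitute for populateBuffer because it likewise returns a count >= 0 and never -1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class IvCheckSketch
{
    // Hypothetical, simplified variant of the decrypt branch (assumption, not library code).
    // Returns false for an empty stream and true once a complete IV has been read.
    public static boolean readIv(InputStream data, byte[] iv) throws IOException
    {
        // readNBytes returns a value between 0 and iv.length; it never returns -1.
        int ivSize = data.readNBytes(iv, 0, iv.length);
        if (ivSize == 0)
        {
            return false; // empty stream: nothing to decrypt
        }
        if (ivSize != iv.length)
        {
            throw new IOException(
                    "AES initialization vector not fully read: only "
                            + ivSize + " bytes read instead of " + iv.length);
        }
        return true;
    }

    public static void main(String[] args) throws IOException
    {
        byte[] iv = new byte[16];
        System.out.println(readIv(new ByteArrayInputStream(new byte[0]), iv));
        System.out.println(readIv(new ByteArrayInputStream(new byte[16]), iv));
    }
}
```

      In this sketch an empty stream is treated as "nothing to decrypt", as in 2.0.19, while a truncated IV (1 to 15 bytes) still raises the IOException.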

      Here is a stacktrace (no line numbers, sorry):

      Caused by: java.io.IOException: AES initialization vector not fully read: only 0 bytes read instead of 16
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.prepareAESInitializationVector(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptDataAESother(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptStream(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseFileObject(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.PDFParser.initialParse(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.PDFParser.parse(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.PDDocument.load(Unknown Source)
      

            People

            • Assignee: lehmi Andreas Lehmkühler
            • Reporter: sfieber Sebastian Fieber
            • Votes: 0
            • Watchers: 3
