PDFBox / PDFBOX-5175

Behaviour change in 2.0.20 due to use of IOUtils.populateBuffer in SecurityHandler.prepareAESInitializationVector leading to IOException for certain PDF



    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.20, 2.0.21, 2.0.23
    • Fix Version/s: 2.0.24, 3.0.0 PDFBox
    • Component/s: Parsing
    • Labels: None


      We have a PDF file, which we cannot share, that carries a certification signature from Adobe.

      Prior to version 2.0.20 this PDF could be loaded and analyzed. From version 2.0.20 onwards, calling the load method of PDDocument results in an IOException.

      We tracked down why this did not happen in 2.0.19 and found that the use of the populateBuffer method changes the behaviour of the prepareAESInitializationVector method.

      Before the code looked like this:

              if (decrypt)
              {
                  // read IV from stream
                  int ivSize = data.read(iv);
                  if (ivSize == -1)
                  {
                      return false;
                  }
                  if (ivSize != iv.length)
                  {
                      throw new IOException(
                              "AES initialization vector not fully read: only "
                                      + ivSize + " bytes read instead of " + iv.length);
                  }
              }

      If data was empty, the read call returned -1, the method returned false, and everything went on okay. 2.0.20 changed the read call to:

                  int ivSize = (int) IOUtils.populateBuffer(data, iv);
                  if (ivSize == -1) {
                      return false;
                  }

      Since the if condition is still there, we are not quite sure whether this change was intentional.

      populateBuffer never returns -1; it always returns a value >= 0, so the if condition can no longer be true.
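
      The difference can be reproduced without PDFBox. The following is a minimal sketch: fillBuffer is a hypothetical stand-in that assumes populateBuffer loops until the buffer is full or the stream ends and then returns the number of bytes read. It is not the library's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class EmptyStreamReadDemo
{
    // Hypothetical stand-in for IOUtils.populateBuffer (an assumption, not PDFBox code):
    // reads until the buffer is full or EOF and returns the byte count, never -1.
    static long fillBuffer(InputStream in, byte[] buffer) throws IOException
    {
        int remaining = buffer.length;
        while (remaining > 0)
        {
            int n = in.read(buffer, buffer.length - remaining, remaining);
            if (n < 0)
            {
                break; // EOF
            }
            remaining -= n;
        }
        return buffer.length - remaining;
    }

    public static void main(String[] args) throws IOException
    {
        byte[] iv = new byte[16];
        // A plain read on an empty stream signals EOF with -1
        int direct = new ByteArrayInputStream(new byte[0]).read(iv);
        // The looping variant reports 0 bytes read instead
        long looped = fillBuffer(new ByteArrayInputStream(new byte[0]), iv);
        System.out.println("read(iv): " + direct);     // prints "read(iv): -1"
        System.out.println("fillBuffer: " + looped);   // prints "fillBuffer: 0"
    }
}
```

      With an empty stream the first call yields -1 and the second yields 0, which matches the "only 0 bytes read instead of 16" message in the exception.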

      So either this is unintentional, in which case it is a bug and the if clause should check for 0 bytes read, or it is intentional, in which case the if clause is obsolete, as is the boolean return value.
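
      The first option (treating 0 bytes read as "no IV present") could look roughly like the sketch below. readIv is a hypothetical helper written only for illustration; it is not the method from SecurityHandler and not necessarily the fix that was applied.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class IvReadSketch
{
    // Hypothetical helper (an assumption, not PDFBox code).
    // Returns false for empty data, true once iv is completely filled,
    // and throws if only part of the IV could be read.
    static boolean readIv(InputStream data, byte[] iv) throws IOException
    {
        int total = 0;
        while (total < iv.length)
        {
            int n = data.read(iv, total, iv.length - total);
            if (n < 0)
            {
                break; // EOF
            }
            total += n;
        }
        if (total == 0)
        {
            // empty stream: nothing to decrypt, same outcome as the old -1 check
            return false;
        }
        if (total != iv.length)
        {
            throw new IOException("AES initialization vector not fully read: only "
                    + total + " bytes read instead of " + iv.length);
        }
        return true;
    }

    public static void main(String[] args) throws IOException
    {
        byte[] iv = new byte[16];
        System.out.println(readIv(new ByteArrayInputStream(new byte[0]), iv));  // prints "false"
        System.out.println(readIv(new ByteArrayInputStream(new byte[16]), iv)); // prints "true"
    }
}
```

      A stream that ends after 1 to 15 bytes still raises the IOException, so a truncated IV is not silently accepted.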

      Here is a stacktrace (no line numbers, sorry):

      Caused by: java.io.IOException: AES initialization vector not fully read: only 0 bytes read instead of 16
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.prepareAESInitializationVector(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptDataAESother(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptStream(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseFileObject(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.PDFParser.initialParse(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdfparser.PDFParser.parse(Unknown Source)
      	at org.apache.pdfbox@2.0.23/org.apache.pdfbox.pdmodel.PDDocument.load(Unknown Source)




            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Sebastian Fieber (sfieber)