PDFBox / PDFBOX-4541

Incorrect? handling of direct/indirect objects

Details

    • Type: Wish
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.14
    • Fix Version/s: None
    • Component/s: Parsing, Writing
    • Labels: None
    • Flags: Patch

    Description

      We ran into blank pages in some of our generated PDF documents. Investigation showed that some referenced objects were never actually written, because they lacked the `isDirect` flag. We were able to mitigate this by adding

      if (retval != null) {
          retval.setDirect(true);
      }
      return retval;
      

      at the end of `BaseParser.parseDirObject()`.
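
      To illustrate what the direct flag changes during writing, here is a minimal toy model. It is not PDFBox code: `Node`, `ToyWriter` and `writeArrayUsingTwice` are hypothetical names, and the only assumption is the usual PDF rule that a direct object is serialized in place while an indirect object is numbered once and referred to as `n 0 R`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DirectFlagSketch {
    /** Hypothetical stand-in for a parsed object; it only models the direct flag. */
    static class Node {
        final String body;
        boolean direct;

        Node(String body) {
            this.body = body;
        }
    }

    /** Toy writer: inlines direct nodes, numbers indirect nodes once and emits a reference. */
    static class ToyWriter {
        // Node does not override equals/hashCode, so lookups are by object identity.
        private final Map<Node, Integer> numbers = new LinkedHashMap<>();

        String writeValue(Node node) {
            if (node.direct) {
                // Direct: the body is serialized in place, once per use.
                return node.body;
            }
            // Indirect: assign an object number on first use, then reuse it.
            Integer number = numbers.get(node);
            if (number == null) {
                number = numbers.size() + 1;
                numbers.put(node, number);
            }
            return number + " 0 R";
        }
    }

    /** Serializes a two-element array whose elements are the same node. */
    static String writeArrayUsingTwice(Node shared) {
        ToyWriter writer = new ToyWriter();
        return "[" + writer.writeValue(shared) + " " + writer.writeValue(shared) + "]";
    }

    public static void main(String[] args) {
        Node font = new Node("<< /Type /Font >>");
        System.out.println(writeArrayUsingTwice(font)); // two references to one object
        font.direct = true;
        System.out.println(writeArrayUsingTwice(font)); // the body appears twice
    }
}
```

      In this model, forcing the flag to true on every parsed object guarantees that the object is emitted, but any object used in more than one place is emitted once per use.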

      While the PDFs were now displayed correctly, QPDF's check reported erroneous hint tables. The offsets there were calculated incorrectly because the objects were now written not just once but several times, in places where they should have been merely referenced. We eventually resolved this by replacing the if condition

      if (willEncrypt || incrementalUpdate || subValue instanceof COSDictionary || subValue == null)
      

      in `COSWriter.visitFromArray(COSArray)` and `COSWriter.visitFromDictionary(COSDictionary)` with

      if (willEncrypt || incrementalUpdate || subValue == null || !(subValue instanceof COSObject))
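
      The difference between the two conditions can be checked in isolation. The sketch below copies both boolean expressions verbatim but evaluates them against hypothetical stand-in classes (`Base`, `Dict`, `Arr`, `Ref`), not the real PDFBox types, so it only shows for which kinds of `subValue` the outcome changes, not what `COSWriter` does in each branch.

```java
public class WriteConditionSketch {
    // Hypothetical stand-ins for the types named in the report; not the real PDFBox classes.
    static class Base {}
    static class Dict extends Base {} // plays the role of COSDictionary
    static class Arr extends Base {}  // plays the role of any other value type, e.g. an array
    static class Ref extends Base {}  // plays the role of COSObject

    /** The condition quoted in the report as the original one. */
    static boolean originalCondition(boolean willEncrypt, boolean incrementalUpdate, Base subValue) {
        return willEncrypt || incrementalUpdate || subValue instanceof Dict || subValue == null;
    }

    /** The condition quoted in the report as the replacement. */
    static boolean proposedCondition(boolean willEncrypt, boolean incrementalUpdate, Base subValue) {
        return willEncrypt || incrementalUpdate || subValue == null || !(subValue instanceof Ref);
    }

    public static void main(String[] args) {
        Base[] samples = { new Dict(), new Arr(), new Ref(), null };
        String[] labels = { "Dict", "Arr", "Ref", "null" };
        // Both flags are false so that only the type of subValue decides the result.
        for (int i = 0; i < samples.length; i++) {
            System.out.println(labels[i]
                    + ": original=" + originalCondition(false, false, samples[i])
                    + ", proposed=" + proposedCondition(false, false, samples[i]));
        }
    }
}
```

      With both flags false, the two conditions agree for dictionaries, for `null`, and for `Ref` instances; they differ only for values that are neither a dictionary nor a `Ref`, where the original evaluates to false and the replacement to true.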
      

      Attachments

        1. linearized_withfix.pdf
          48 kB
          Jonathan
        2. linearized.pdf
          48 kB
          Jonathan
        3. broken_censored.pdf
          51 kB
          Jonathan

          People

            Assignee: Unassigned
            Reporter: Rahn2 Jonathan
            Votes: 0
            Watchers: 3
