  1. PDFBox
  2. PDFBOX-4097

Compressed objects are lost when the brute-force search fails to handle compressed streams

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.8
    • Fix Version/s: 2.0.10, 3.0.0 PDFBox
    • Component/s: Parsing
    • Labels:
      None

      Description

      Compressed objects described in cross-reference streams are lost when the brute-force search fails to handle the object streams that contain them.

      The attached PDF has an object 1336, but its cross-reference offset actually points to object 1828. This inconsistency triggers a brute-force search (started by COSParser.checkXrefOffsets).
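
      To illustrate the kind of consistency check involved, here is a minimal sketch in plain Java. It is not the code of COSParser.checkXrefOffsets; the class and method names are hypothetical, and it assumes a simplified rule: an offset is valid only if the bytes at that position start with "<object number> <generation> obj".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox.
final class XrefOffsetCheckSketch {

    /** Returns true if the bytes at 'offset' start with "<objNum> <gen> obj". */
    static boolean pointsToObject(byte[] pdf, long offset, long objNum, int gen) {
        byte[] expected = (objNum + " " + gen + " obj").getBytes(StandardCharsets.ISO_8859_1);
        if (offset < 0 || offset + expected.length > pdf.length) {
            return false; // offset lies outside the file
        }
        for (int i = 0; i < expected.length; i++) {
            if (pdf[(int) offset + i] != expected[i]) {
                return false; // a different object (or garbage) lives here
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdf = "%PDF-1.5\n1828 0 obj\n<< /Type /ObjStm >>\nendobj\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        long offset = 9; // first byte after the 9-byte header line
        System.out.println(pointsToObject(pdf, offset, 1828, 0)); // true
        System.out.println(pointsToObject(pdf, offset, 1336, 0)); // false: mismatch as in this report
    }
}
```

      In this simplified model, a single mismatch such as the one for object 1336 is enough to reject the whole table and fall back to scanning the file.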

      During the search (in bfSearchForObjStreams), object streams 1828, 1829, and 1830 failed to decompress because the streams are "corrupted" (the Params field is missing from the dictionary, or the Filter is wrong). As a result, the 462 compressed objects described in the cross-reference streams are lost. Important objects (the Root, the Pages, etc.) refer to objects stored in 1828 or one of the other failed streams, so they all resolve to null, because the corrected xref offsets no longer contain them. Further parsing is impossible.

      However, when I bypassed checkXrefOffsets, the PDF rendered correctly without any noticeable error. It seems that object 1336 is not used in the PDF.

      "Corrupted" 1828:

      1828 0 obj
      <<
      /Length 2176
      /Type /ObjStm
      /N 200
      /First 2103
      /Filter /FlatDecode
      >>
      ...

      This stream cannot be processed by bfSearchForObjStreams, but parseObjectStream handles it correctly.
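
      For context on what is lost when such a stream cannot be read: per the PDF specification, the decoded data of an object stream begins with /N pairs of integers (object number, offset relative to /First), followed by the objects themselves. The sketch below reads that header from already-decoded bytes. It is an illustration only, not PDFBox code; the class and method names are hypothetical, and decompression is assumed to have happened elsewhere.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class ObjStmHeaderSketch {

    /**
     * Reads n pairs "objNum relativeOffset" from the first 'first' bytes of the
     * decoded stream and maps each object number to its absolute position
     * inside the decoded data (first + relativeOffset).
     */
    static Map<Long, Integer> readHeader(byte[] decoded, int n, int first) {
        if (first < 0 || first > decoded.length) {
            throw new IllegalArgumentException("/First is outside the decoded data");
        }
        String header = new String(decoded, 0, first, StandardCharsets.ISO_8859_1).trim();
        String[] tokens = header.isEmpty() ? new String[0] : header.split("\\s+");
        if (tokens.length < 2 * n) {
            throw new IllegalArgumentException("header holds fewer than /N pairs");
        }
        Map<Long, Integer> positions = new LinkedHashMap<>();
        for (int i = 0; i < n; i++) {
            long objNum = Long.parseLong(tokens[2 * i]);
            int relativeOffset = Integer.parseInt(tokens[2 * i + 1]);
            positions.put(objNum, first + relativeOffset);
        }
        return positions;
    }

    public static void main(String[] args) {
        // Two objects: 11 at relative offset 0 and 12 at relative offset 9; the header is 10 bytes.
        byte[] decoded = "11 0 12 9 <</A 1>> [2 3]".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readHeader(decoded, 2, 10)); // {11=10, 12=19}
    }
}
```

      For stream 1828 above, /N 200 and /First 2103 play the roles of n and first, so failing to decode it hides 200 object entries at once.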

       

      Would it be possible to add a fallback that preserves the offsets of compressed object keys when the brute-force search hits an error?
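
      One possible shape for such a fallback is sketched below. This is only an illustration of the idea, not the fix that was applied in PDFBox. All names are hypothetical, keys are simplified to plain object numbers (generation numbers are ignored), and the sketch assumes a convention in which a compressed entry is stored as the negated number of its containing object stream.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of PDFBox.
final class XrefFallbackSketch {

    /**
     * Starts from the brute-force result and restores original entries for
     * compressed objects whose object stream could not be decoded during the
     * search. Entries found by the brute-force search always win.
     */
    static Map<Long, Long> mergeWithFallback(Map<Long, Long> originalXref,
                                             Map<Long, Long> bruteForceXref,
                                             Set<Long> undecodableObjStreams) {
        Map<Long, Long> merged = new HashMap<>(bruteForceXref);
        for (Map.Entry<Long, Long> entry : originalXref.entrySet()) {
            long value = entry.getValue();
            boolean compressed = value < 0; // assumed convention: -(object stream number)
            if (compressed && undecodableObjStreams.contains(-value)) {
                merged.putIfAbsent(entry.getKey(), value);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Long, Long> original = new HashMap<>();
        original.put(1L, -1828L);   // e.g. the Root, stored inside object stream 1828
        original.put(1336L, 5000L); // wrong offset: actually points to object 1828
        Map<Long, Long> bruteForce = new HashMap<>();
        bruteForce.put(1828L, 5000L); // stream object found, but its content was unreadable
        Set<Long> failed = new HashSet<>();
        failed.add(1828L);

        Map<Long, Long> merged = mergeWithFallback(original, bruteForce, failed);
        System.out.println(merged.get(1L));            // -1828
        System.out.println(merged.containsKey(1336L)); // false
    }
}
```

      The bad uncompressed entry for 1336 is still dropped, while objects that only exist inside the undecodable stream stay reachable so that a later, more tolerant parse of that stream can resolve them.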

              People

              • Assignee: Andreas Lehmkühler (lehmi)
              • Reporter: Cheng Zhong (hust.zcheng)