  1. PDFBox
  2. PDFBOX-383

BaseParser incorrectly handling stream, exhibiting IOException


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 0.7.3
    • Fix Version: 1.7.0
    • Component: Parsing
    • Labels: None
    • Environment: PDFBox 0.7.3 with Java 5 running on Windows

    Description

      When loading a PDF file containing a file attachment annotation, errors may occur when two conditions hold:

      • the Length value in the dictionary of the F stream holds an indirect reference to an integer value
      • the content of the filtered stream contains the word 'endstream'

      Typically this occurs when the PDF file contains a stream definition such as the following:

      12 0 obj
      << /Length 16 0 R
      /Filter /FlateDecode
      >>
      stream

      {content}
      endstream
      endobj
      ...
      16 0 obj {length}
      endobj
      ....

      and the filtered {content} itself contains the string "endstream"
      (see line 3700 of the attachment).

      The problem lies in the way the stream content is always read by the method readUntilEndStream(), which stops at the first 'endstream' sequence it finds, even when that sequence is part of the stream data.
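      To illustrate the failure mode, here is a minimal sketch of a scan that stops at the first 'endstream' marker. It is not the PDFBox code: the class NaiveEndstreamScan and its methods are hypothetical stand-ins for the behavior described above, and the sample bytes are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not taken from BaseParser.
class NaiveEndstreamScan {
    private static final byte[] MARKER =
            "endstream".getBytes(StandardCharsets.ISO_8859_1);

    // Index of the first occurrence of MARKER at or after 'from', or -1 if absent.
    static int indexOfMarker(byte[] data, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - MARKER.length; i++) {
            for (int j = 0; j < MARKER.length; j++) {
                if (data[i + j] != MARKER[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Reads from 'start' up to (not including) the first MARKER, ignoring any Length.
    static byte[] readUntilFirstMarker(byte[] data, int start) {
        int end = indexOfMarker(data, start);
        if (end < 0) {
            throw new IllegalArgumentException("no endstream marker found");
        }
        return Arrays.copyOfRange(data, start, end);
    }

    public static void main(String[] args) {
        // The 13-byte payload happens to contain the marker bytes.
        String payload = "AAendstreamBB";
        byte[] file = ("stream\n" + payload + "\nendstream\n")
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] read = readUntilFirstMarker(file, "stream\n".length());
        System.out.println("payload bytes: " + payload.length()
                + ", bytes read: " + read.length);
    }
}
```

      With this input the scan returns only the 2 bytes before the embedded marker instead of the full 13-byte payload, so any later decoding step sees truncated data.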

      A (partial) fix was made that reads the stream content in one of three ways:

      • if the Length is known (i.e. it is a direct object), {length} bytes are read and written to the stream's filtered stream
      • if the Length is unknown and the filter is FlateFilter, the code decodes the data (the FlateDecode algorithm does not require the length of the encoded data to be known ahead of time) and stores the result as the stream's unfiltered stream
      • otherwise, the current behavior is kept
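      The three cases above can be sketched roughly as follows. This is not the attached BaseParser.java patch: StreamBodyReader, Body and read are hypothetical names, the parser state is reduced to a byte array plus an offset, and java.util.zip.Inflater is assumed as the Flate decoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// Hypothetical sketch of the three-way strategy; not the actual patch.
class StreamBodyReader {
    private static final byte[] MARKER =
            "endstream".getBytes(StandardCharsets.ISO_8859_1);

    // The bytes that were read, and whether they are already decoded.
    static final class Body {
        final byte[] bytes;
        final boolean decoded;

        Body(byte[] bytes, boolean decoded) {
            this.bytes = bytes;
            this.decoded = decoded;
        }
    }

    // directLength is null when Length is an indirect reference not yet resolved.
    static Body read(byte[] data, int start, Integer directLength, boolean flate) {
        if (directLength != null) {
            // Case 1: Length is known, so copy exactly that many filtered bytes.
            if (directLength < 0 || start + directLength > data.length) {
                throw new IllegalArgumentException("Length exceeds available data");
            }
            return new Body(Arrays.copyOfRange(data, start, start + directLength), false);
        }
        if (flate) {
            // Case 2: the zlib data marks its own end, so decode until finished.
            Inflater inflater = new Inflater();
            try {
                inflater.setInput(data, start, data.length - start);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                while (!inflater.finished()) {
                    int n = inflater.inflate(buf);
                    if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                        throw new IllegalArgumentException("truncated or unsupported Flate data");
                    }
                    out.write(buf, 0, n);
                }
                return new Body(out.toByteArray(), true);
            } catch (DataFormatException e) {
                throw new IllegalArgumentException("invalid Flate data", e);
            } finally {
                inflater.end();
            }
        }
        // Case 3: previous behavior, stop at the first 'endstream' marker.
        for (int i = start; i <= data.length - MARKER.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(data, i, i + MARKER.length), MARKER)) {
                return new Body(Arrays.copyOfRange(data, start, i), false);
            }
        }
        throw new IllegalArgumentException("no endstream marker found");
    }
}
```

      In this sketch only the third case still depends on where the first 'endstream' appears, which matches the fix being described as partial: a stream with an indirect Length and a non-Flate filter would still be truncated.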

      Running the modified code on the files that exhibited errors fixed the problems that were encountered.

      Attachments

        1. BaseParser.java
          52 kB
          Son
        2. fail.pdf
          1.33 MB
          Son


          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Son (lamontagnebleue)
