PDFBox / PDFBOX-537

Endless loop in org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary() on certain corrupt PDF streams

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0-incubator
    • Fix Version/s: 1.0.0
    • Component/s: Parsing
    • Labels: None
    • Environment: The problem occurs on certain corrupt streams. I have attached the file "corrupt-endless-loop-in-0.8.pdf", which is 447 bytes long and exhibits this problem. Not sure, but I think this file was originally longer and was somehow cut.

    Description

      The endless loop seems to have been introduced by the changes of 01-Sep-2009 in svn revision 810122, which added a loop that waits for a valid dictionary:

      Index: PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java
      ===================================================================
      --- PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 793364)
      +++ PDFBox/src/main/java/org/apache/pdfbox/pdfparser/BaseParser.java (revision 810122)
      @@ -183,7 +183,23 @@
       if( c == '>')
       {
           done = true;
      -}
      +}
      +else
      +if(c != '/')
      +{
      +    //an invalid dictionary, we are expecting
      +    //the key, read until we can recover
      +    logger().warning("Invalid dictionary, found:" + (char)c + " but expected:\''");
      +    int read = pdfSource.read();
      +    while(read != -1 && read != '/' && read != '>')
      +    {
      +        read = pdfSource.read();
      +    }
      +    if(read != -1)
      +    {
      +        pdfSource.unread(read);
      +    }
      +}
       else
       {
           COSName key = parseCOSName();
      @@ -206,9 +222,12 @@

           if( value == null )
           {
      -        throw new IOException("Bad Dictionary Declaration " + pdfSource );
      +        logger().warning("Bad Dictionary Declaration " + pdfSource );
           }
      -    obj.setItem( key, value );
      +    else
      +    {
      +        obj.setItem( key, value );
      +    }
           }
       }
       char ch = (char)pdfSource.read();
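
One possible reading of the hang, offered as an assumption rather than something this report confirms: if the stream ends inside a dictionary, the recovery loop in the diff stops on -1 without unreading anything, and nothing sets done, so an enclosing loop could keep going indefinitely. The sketch below is not PDFBox code. It is a hypothetical, simplified scanner (the class DictionaryRecoverySketch and its methods scanNames and readName are invented for illustration) that only collects name tokens and shows one way to make the same recovery pattern terminate, by treating end of input as the end of the dictionary.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a dictionary key scanner; not the PDFBox implementation. */
final class DictionaryRecoverySketch {

    /** Collects every "/Name" token seen before '>' or end of input. */
    static List<String> scanNames(byte[] data) {
        List<String> names = new ArrayList<>();
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data), 1)) {
            boolean done = false;
            while (!done) {
                int c = in.read();
                if (c == -1 || c == '>') {
                    // Assumed guard: end of input also ends the dictionary.
                    done = true;
                } else if (c == '/') {
                    names.add(readName(in));
                } else if (!Character.isWhitespace(c)) {
                    // Same recovery idea as the diff: skip to the next '/' or '>'.
                    int read = in.read();
                    while (read != -1 && read != '/' && read != '>') {
                        read = in.read();
                    }
                    if (read == -1) {
                        done = true;      // nothing left to recover to
                    } else {
                        in.unread(read);  // let the outer loop handle '/' or '>'
                    }
                }
                // Whitespace is simply consumed and the loop continues.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Reads name characters up to whitespace, '/', '>' or end of input. */
    private static String readName(PushbackInputStream in) throws IOException {
        StringBuilder name = new StringBuilder();
        int c = in.read();
        while (c != -1 && c != '/' && c != '>' && !Character.isWhitespace(c)) {
            name.append((char) c);
            c = in.read();
        }
        if (c != -1) {
            in.unread(c);  // put the delimiter back for the caller
        }
        return name.toString();
    }

    public static void main(String[] args) {
        // Truncated input: no closing '>' and trailing non-name tokens.
        byte[] truncated = "/Type /Pages 12 0 R".getBytes(StandardCharsets.US_ASCII);
        System.out.println(scanNames(truncated));
    }
}
```

The point of the sketch is only the two places where -1 sets done. Whether the actual fix in PDFBox 1.0.0 takes this form is not stated in this issue.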

      Attachments

        1. corrupt-endless-loop-in-0.8.pdf
          0.4 kB
          Hacho
        2. pdfbox-537-proposed-fix.zip
          18 kB
          Hacho
        3. PDFParser.zip
          14 kB
          Jignesh Sh
        4. TestPDFBOX537.java
          1.0 kB
          Hacho

          People

            Assignee: Unassigned
            Reporter: Hacho
            Votes: 0
            Watchers: 1

              Time Tracking

                Original Estimate: 1h
                Remaining Estimate: 1h
                Time Spent: Not Specified