  1. PDFBox
  2. PDFBOX-536

missing iterator.hasNext() test in PDFXrefStreamParser

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0-incubator
    • Fix Version/s: 1.0.0
    • Component/s: Parsing
    • Labels: None

    Description

      The class: org.apache.pdfbox.pdfparser.PDFXrefStreamParser

      advances an iterator in its parse method without first checking hasNext().

      Specifically, line 100 should be changed from:

      while(pdfSource.available() > 0)

      to:

      while(pdfSource.available() > 0 && objIter.hasNext())

      Without this check, line 115 throws a NoSuchElementException when the stream still has bytes available but the iterator is already exhausted.
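
      The difference between the two loop conditions can be illustrated with a minimal, self-contained sketch. This is not the PDFBox source: the class XrefLoopSketch, its two methods, and the one-byte-per-entry layout are hypothetical stand-ins, with a ByteArrayInputStream playing the role of pdfSource and a plain Iterator<Integer> playing the role of objIter.

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the loop described in this issue; not PDFBox code.
class XrefLoopSketch {

    // Mirrors the original condition: only the stream is checked.
    // If the stream holds more entries than objIter has object numbers,
    // objIter.next() throws java.util.NoSuchElementException.
    static List<String> readUnguarded(ByteArrayInputStream pdfSource,
                                      Iterator<Integer> objIter) {
        List<String> entries = new ArrayList<>();
        while (pdfSource.available() > 0) {
            Integer objID = objIter.next();
            entries.add(objID + ":" + pdfSource.read());
        }
        return entries;
    }

    // Mirrors the proposed condition: the loop also stops once the
    // iterator has no more object numbers, leaving extra bytes unread.
    static List<String> readGuarded(ByteArrayInputStream pdfSource,
                                    Iterator<Integer> objIter) {
        List<String> entries = new ArrayList<>();
        while (pdfSource.available() > 0 && objIter.hasNext()) {
            Integer objID = objIter.next();
            entries.add(objID + ":" + pdfSource.read());
        }
        return entries;
    }
}
```

      In this sketch, feeding three bytes and two object numbers to readGuarded returns two entries and stops, while the same input to readUnguarded fails on the third pass through the loop.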

      I will attach a test file that triggers the problem (during text extraction) and a patched version of PDFXrefStreamParser.java.

      Attachments

        1. PDFXrefStreamParser.java
          6 kB
          Mel Martinez
        2. 09_05_11_Archiv.pdf
          114 kB
          Mel Martinez

            People

              Assignee: Unassigned
              Reporter: Mel Martinez
