  1. PDFBox
  2. PDFBOX-209

java.lang.OutOfMemoryError while parsing pdf file

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Parsing
    • Labels:
      None

      Description

      [imported from SourceForge]
      http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1581061
      Originally submitted by hui85 on 2006-10-19 23:47.

      I want to extract text from a PDF file using
      PDFTextStripper. Most PDF files work, but in the
      following case I get an OutOfMemoryError.
      The PDF file I want to parse is about 312k, and my JVM
      maximum heap size (-Xmx) is about 512m.

      I get the following stackTrace:

      java.lang.OutOfMemoryError
      at java.util.zip.Inflater.inflateBytes(Native Method)
      at java.util.zip.Inflater.inflate(Unknown Source)
      at java.util.zip.InflaterInputStream.read(Unknown Source)
      at org.pdfbox.filter.FlateFilter.decode(FlateFilter.java:97)
      at org.pdfbox.cos.COSStream.doDecode(COSStream.java:319)
      at org.pdfbox.cos.COSStream.doDecode(COSStream.java:249)
      at org.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:173)
      at org.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:91)
      at org.pdfbox.cos.COSStream.getStreamTokens(COSStream.java:135)
      at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:189)
      at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:160)
      at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:355)
      at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:268)
      at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)

      I use pdfbox-0.7.3, tested on Win 2000, JVM 1.4.2.
      The file that causes the error is too big to attach
      (271K zipped).
      Please mail me and I will send it.
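
      The trace shows the error being raised while a Flate-compressed stream is inflated. Since the PDF was never attached, the actual cause is unknown. One way to investigate a report like this might be to inflate the suspect stream with a hard cap on the decoded size, to see whether a small compressed stream expands far beyond what is expected. The sketch below uses only java.util.zip; the class BoundedInflate and its methods are hypothetical names for illustration and are not part of PDFBox or its FlateFilter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical diagnostic helper, not PDFBox code.
public class BoundedInflate {

    /** Inflates zlib data, failing fast if the output would exceed maxBytes. */
    public static byte[] inflate(byte[] compressed, int maxBytes) throws IOException {
        InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                // Stop before the decoded data grows past the configured cap.
                if (out.size() + n > maxBytes) {
                    throw new IOException("Decoded size exceeds limit of " + maxBytes + " bytes");
                }
                out.write(buffer, 0, n);
            }
        } finally {
            in.close();
        }
        return out.toByteArray();
    }

    /** Demo helper: zlib-compresses a byte array. */
    public static byte[] deflate(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DeflaterOutputStream dos = new DeflaterOutputStream(out);
        dos.write(raw);
        dos.close();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = new byte[1000000]; // highly compressible: all zero bytes
        byte[] packed = deflate(raw);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("inflated bytes:   " + inflate(packed, 2000000).length);
        try {
            inflate(packed, 1024); // cap far below the real decoded size
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      A cap like this turns an unbounded allocation into an ordinary IOException that names the offending stream. It would not help if the failure came from somewhere other than the decoded size, which this report does not establish.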

      [comment on SourceForge]
      Originally sent by benlitchfield.
      Logged In: YES
      user_id=601708

      see test-1581061.pdf

      [comment on SourceForge]
      Originally sent by benlitchfield.
      Logged In: YES
      user_id=601708

      Thanks, please email the PDF to ben@benlitchfield.com or
      upload to ftp.pdfbox.org

      Thanks,
      Ben

        Activity

        tboehme Timo Boehme added a comment -

        This is a rather old report without an attached PDF. Thus we cannot check whether the problem, if it really is a problem, still exists in the current version. Resolve as 'cannot reproduce'.


          People

          • Assignee:
            tboehme Timo Boehme
            Reporter:
            Anonymous