PDFBox / PDFBOX-872

ERROR org.apache.pdfbox.filter.FlateFilter - Stop reading corrupt stream

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3.1
    • Fix Version/s: 1.4.0
    • Component/s: Parsing
    • Labels:
      None
    • Environment:
      Windows XP [Version 5.1.2600]
      java version "1.6.0_22"
      Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
      Java HotSpot(TM) Client VM (build 17.1-b03, mixed mode, sharing)

      Description

      This report: http://www2.goldmansachs.com/our-firm/press/press-releases/current/pdfs/2010-q2-earnings.pdf

      With this code:
      public static String getTransformed(InputStream inputStream) {
          PDDocument pdDocument = null;
          String document = null;
          try {
              PDFParser parser = new PDFParser(inputStream);
              parser.parse();
              pdDocument = parser.getPDDocument();
              PDFText2HTML pdf2html = new PDFText2HTML("UTF-8");
              document = pdf2html.getText(pdDocument);
          } catch (IOException e) {
              e.printStackTrace();
          } finally {
              if (pdDocument != null) {
                  try {
                      pdDocument.getDocument().close();
                  } catch (IOException e) {
                      e.printStackTrace();
                  }
              }
          }
          return document;
      }

      returns:
      17:01:15,609 [main] ERROR org.apache.pdfbox.filter.FlateFilter - Stop reading corrupt stream
      null
      java.io.IOException: Error: Expected an integer type, actual=''
      at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1310)
      at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:81)
      at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:449)
      at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1112)
      at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:591)
      at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:246)
      at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:184)

      The same file opens normally in Foxit PDF.
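
      The log line "Stop reading corrupt stream" suggests that the Flate decoder gives up partway through a compressed stream it cannot finish decoding. The sketch below is not PDFBox code; it uses only java.util.zip to illustrate the general behaviour of a tolerant inflate: keep whatever decoded cleanly, stop at the first error. The class and method names (TolerantInflate, deflate, inflateTolerant) are hypothetical and chosen for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration class; not part of PDFBox.
public class TolerantInflate {

    /** Compresses data in zlib format (the format a Flate-encoded stream carries). */
    public static byte[] deflate(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Inflates as much as possible and stops at the first sign of corruption. */
    public static byte[] inflateTolerant(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            // EOFException (truncated input) and ZipException (invalid data)
            // are both IOExceptions; keep what was decoded so far.
            System.err.println("Stop reading corrupt stream: " + e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        byte[] original = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = deflate(original);

        // Simulate a damaged stream by cutting the compressed bytes in half.
        byte[] truncated = Arrays.copyOf(compressed, compressed.length / 2);
        byte[] partial = inflateTolerant(truncated);

        System.out.println("original bytes:  " + original.length);
        System.out.println("recovered bytes: " + partial.length);
    }
}
```

      A caller of such a decoder receives fewer bytes than the stream originally held, possibly none. If the same happens to an object stream, the parser that reads it next may find no data where it expects a number, which would be consistent with the "Expected an integer type, actual=''" exception above. This is an inference from the log output, not a confirmed diagnosis.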

        Attachments

        1. PDFBOX-872.patch
          8 kB
          Martijn Brinkers

              People

              • Assignee: Adam Nichols (adamnichols)
              • Reporter: Vladimir (vladimir_postrigan)