  1. PDFBox
  2. PDFBOX-2497

GRAVE: FlateFilter: stop reading corrupt stream due to a DataFormatException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.8.6, 1.8.7
    • Fix Version/s: 1.8.8
    • Component/s: Text extraction
    • Labels:
      None
    • Environment:
      java version "1.7.0_65"
      Java(TM) SE Runtime Environment (build 1.7.0_65-b17)
      Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
      Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux

      Description

      java -jar pdfbox-app-1.8.7.jar ExtractText git-community-book.pdf git-community-book.txt

      logs the following error:

      nov. 13, 2014 5:34:31 PM org.apache.pdfbox.filter.FlateFilter decode
      GRAVE: FlateFilter: stop reading corrupt stream due to a DataFormatException

      The resulting text file is incomplete; it stops at:
      Chapter 7: Fonctionnement Interne et Plomberie
      133
      Git Community Book
      134
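      For context, the DataFormatException named in the log message is the exception that java.util.zip.Inflater raises when deflate-compressed data is malformed. The sketch below is not PDFBox's FlateFilter code; it is a standalone illustration, using only the JDK, of a decoder that keeps whatever it has decoded so far and stops at the first corrupt block, which would be consistent with output that ends early. The class and method names (CorruptFlateDemo, deflate, inflateLeniently) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; not part of PDFBox.
class CorruptFlateDemo {

    /** Compresses the input into zlib (deflate) format. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates as much as possible and stops quietly on corrupt or truncated data. */
    static byte[] inflateLeniently(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and no more input: the stream was truncated.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break;
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            // Keep the bytes decoded before the failure and report the problem.
            System.err.println("stop reading corrupt stream: " + e.getMessage());
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "Git Community Book".getBytes(StandardCharsets.UTF_8);
        byte[] packed = deflate(original);
        System.out.println("intact stream:  " + inflateLeniently(packed).length + " bytes");

        packed[0] = 0; // damage the zlib header so the inflater rejects the stream
        System.out.println("corrupt stream: " + inflateLeniently(packed).length + " bytes");
    }
}
```

      With a decoder written this way, a damaged stream yields fewer bytes than were originally compressed and no exception reaches the caller, so the only visible signs are the log line and the missing content.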

      PDFDebugger does not show any structural problem in git-community-book.pdf.

      PDFSplit -split 1 fails at git-community-book-53.pdf, which is only partially written. Opening that file in PDFDebugger gives:

      java -jar pdfbox-app-1.8.7.jar PDFDebugger git-community-book-53.pdf
      Xlib: extension "RANDR" missing on display ":0.0".
      nov. 13, 2014 5:30:32 PM org.apache.pdfbox.pdfparser.BaseParser parseCOSDictionary
      AVERTISSEMENT: Bad Dictionary Declaration org.apache.pdfbox.io.PushBackInputStream@78900845
      nov. 13, 2014 5:30:32 PM org.apache.pdfbox.pdfparser.BaseParser parseCOSDictionary
      AVERTISSEMENT: Invalid dictionary, found: '���' but expected: '/'
      nov. 13, 2014 5:30:32 PM org.apache.pdfbox.pdfparser.XrefTrailerResolver setStartxref
      AVERTISSEMENT: Did not found XRef object at specified startxref position 0

        Attachments

        1. git-community-book-53.pdf
          122 kB
          Laurent Roger
        2. git-community-book-52.pdf
          436 kB
          Laurent Roger
        3. git-community-book.txt
          144 kB
          Laurent Roger
        4. git-community-book.pdf
          687 kB
          Laurent Roger


            People

            • Assignee:
              Unassigned
              Reporter:
              lc3t35 Laurent Roger
            • Votes:
              0
              Watchers:
              3
