PDFBox / PDFBOX-2976

java.util.zip.DataFormatException: incorrect data check

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 1.8.11, 2.0.0
    • Component/s: Parsing
    • Labels: None
    • Environment: Linux Mint 17.2 x64, JDK7u79, Glassfish 3.1.2.2

    Description

      When trying to open certain PDF files (examples attached, as well as any MSDS available at http://www.scbt.com/datasheet-356376.html ), an exception is thrown and the file is not parsed:
      java.io.IOException: java.util.zip.DataFormatException: incorrect data check
      at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
      at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:78)
      at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:160)
      at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:143)
      at org.apache.pdfbox.pdmodel.PDPage.getContents(PDPage.java:148)
      at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:92)
      at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:450)
      at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:437)
      at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:148)
      at org.apache.pdfbox.text.PDFTextStreamEngine.processPage(PDFTextStreamEngine.java:117)
      at org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:367)
      at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:303)
      at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:248)
      at org.apache.pdfbox.text.PDFTextStripper.getText(PDFTextStripper.java:209)

      – or –

      java.io.IOException: java.util.zip.DataFormatException: incorrect data check
      at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
      at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:78)
      at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:160)
      at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:143)
      at org.apache.pdfbox.pdmodel.PDPage.getContents(PDPage.java:148)
      at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:92)
      at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:450)
      at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:437)
      at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:148)
      at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:179)
      at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:205)
      at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:136)
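
      For context, "incorrect data check" is the message zlib gives when the Adler-32 checksum at the end of a zlib stream does not match the decoded data. The following is a minimal sketch using only java.util.zip, not PDFBox code, that reproduces the error on a deliberately corrupted stream and shows one possible lenient strategy: skip the 2-byte zlib header and inflate in raw ("nowrap") mode so the trailer is never verified. The class and method names are hypothetical, the sample content stream is made up, and whether the attached patch takes this approach is not stated in this report.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; not part of PDFBox.
class FlateChecksumDemo {

    /** Compresses input into a zlib stream: 2-byte header, deflate data, 4-byte Adler-32 trailer. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reads all output from an Inflater whose input has already been set. */
    private static byte[] drain(Inflater inflater) throws DataFormatException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input, or a preset dictionary would be required
                }
                out.write(buf, 0, n);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Strict decoding: header and Adler-32 trailer are verified, so a bad checksum throws. */
    static byte[] inflateStrict(byte[] zlibData) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(zlibData);
        return drain(inflater);
    }

    /**
     * Lenient decoding: raw deflate mode, so the Adler-32 trailer is ignored.
     * Assumes a plain 2-byte zlib header without a preset dictionary.
     */
    static byte[] inflateIgnoringChecksum(byte[] zlibData) throws DataFormatException {
        Inflater inflater = new Inflater(true); // nowrap = true
        inflater.setInput(zlibData, 2, zlibData.length - 2);
        return drain(inflater);
    }

    public static void main(String[] args) throws DataFormatException {
        // Made-up content stream used only as sample data.
        byte[] original = "BT /F1 12 Tf (Hello) Tj ET".getBytes(StandardCharsets.US_ASCII);
        byte[] corrupted = deflate(original);
        corrupted[corrupted.length - 1] ^= 0xFF; // damage the last checksum byte

        try {
            inflateStrict(corrupted);
            System.out.println("strict decode unexpectedly succeeded");
        } catch (DataFormatException e) {
            System.out.println("strict decode failed: " + e.getMessage());
        }

        byte[] recovered = inflateIgnoringChecksum(corrupted);
        System.out.println("lenient decode matches original: " + Arrays.equals(original, recovered));
    }
}
```

      The trade-off of the lenient path is that genuinely damaged data is no longer detected by the checksum, so a caller would probably want to log a warning when falling back to it.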

      Attachments

        1. sc-356376(1).pdf
          60 kB
          Felix Rudolphi
        2. sc-356376.pdf
          56 kB
          Felix Rudolphi
        3. sc-356376(1)-x.pdf
          60 kB
          Felix Rudolphi
        4. sc-356376-x.pdf
          55 kB
          Felix Rudolphi
        5. PDFBOX2976_FlateFilter2.patch
          2 kB
          Andreas Lehmkühler
        6. 500 ml (500.0) - Bisomer® MPEG350MA - 26915-72-0 - IVW_ 444 Oberflächentechnik.pdf
          88 kB
          Felix Rudolphi

            People

              Assignee: lehmi Andreas Lehmkühler
              Reporter: chemFelix Felix Rudolphi
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 3h
                  Remaining: 3h
                  Logged: Not Specified