  1. PDFBox
  2. PDFBOX-5476

"Error: Expected operator 'ID' actual='In' at stream offset 142897 []" error occurs in some PDFs


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.0.26
    • Fix Version/s: None
    • Component/s: Rendering
    • Labels: None

    Description

      Hi,

      When we upload certain PDFs, we get the following error: "Error: Expected operator 'ID' actual='In' at stream offset 142897 []"

       

      We first hit this with PDFBox 2.0.25 and also tried 2.0.26. Both versions work fine with some PDFs, but not with others.

       

      Code:

      public static boolean extractFirstPdfPageAsImageJPEG(final File sourcePdf, final File resultImg,
              final Integer maxWidth, final Integer maxHeight) {
          try (final PDDocument document = PDDocument.load(sourcePdf)) {
              final PDFRenderer pdfRenderer = new PDFRenderer(document);
              final BufferedImage extractedImage = pdfRenderer.renderImageWithDPI(0, 100, ImageType.RGB);

              final int originalHeight = extractedImage.getHeight();
              final int originalWidth = extractedImage.getWidth();

              int scaledHeight = originalHeight;
              int scaledWidth = originalWidth;

              if (originalWidth > maxWidth) {
                  scaledWidth = maxWidth;
                  scaledHeight = scaledWidth * originalHeight / originalWidth;
                  if (scaledHeight > maxHeight) {
                      scaledHeight = maxHeight;
                      scaledWidth = scaledHeight * originalWidth / originalHeight;
                  }
              } else if (originalHeight > maxHeight) {
                  scaledHeight = maxHeight;
                  scaledWidth = scaledHeight * originalWidth / originalHeight;
              }

              // creates output image
              final BufferedImage resizedImage = new BufferedImage(scaledWidth, scaledHeight, extractedImage.getType());

              final Graphics2D g2d = resizedImage.createGraphics();
              g2d.drawImage(extractedImage, 0, 0, scaledWidth, scaledHeight, null);
              g2d.dispose();
              ImageIO.write(resizedImage, "JPEG", resultImg);
              return true;
          } catch (final IOException e) {
              LOG.error(e.getMessage(), e);
              return false;
          }
      }
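
      The resizing arithmetic in the method above does not depend on PDFBox, so it can be pulled into a small pure function and checked on its own, which helps rule it out as a source of the failure. The following is only a minimal sketch that mirrors the same integer arithmetic; the class and method names (ThumbnailSize, fitWithin) are hypothetical and not part of the original code.

```java
public class ThumbnailSize {

    /**
     * Hypothetical helper mirroring the scaling logic of the method above.
     * Returns {width, height} shrunk to fit maxWidth x maxHeight while keeping
     * the aspect ratio (integer division, as in the original). Never upscales.
     */
    public static int[] fitWithin(int width, int height, int maxWidth, int maxHeight) {
        int scaledWidth = width;
        int scaledHeight = height;

        if (width > maxWidth) {
            // Too wide: clamp the width first, derive the height from it.
            scaledWidth = maxWidth;
            scaledHeight = scaledWidth * height / width;
            if (scaledHeight > maxHeight) {
                // Still too tall: clamp the height and recompute the width.
                scaledHeight = maxHeight;
                scaledWidth = scaledHeight * width / height;
            }
        } else if (height > maxHeight) {
            // Only too tall: clamp the height, derive the width from it.
            scaledHeight = maxHeight;
            scaledWidth = scaledHeight * width / height;
        }
        return new int[] {scaledWidth, scaledHeight};
    }

    public static void main(String[] args) {
        int[] size = fitWithin(1000, 500, 200, 200);
        System.out.println(size[0] + "x" + size[1]);
    }
}
```

      With such a helper, the rendering call is the only PDFBox-dependent step left in the method, which matches where the stack trace points.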

       

      The call pdfRenderer.renderImageWithDPI(0, 100, ImageType.RGB) fails with this error:

       

      Error: Expected operator 'ID' actual='In' at stream offset 142897 []
      java.io.IOException: Error: Expected operator 'ID' actual='In' at stream offset 142897
          at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:280) ~[pdfbox-2.0.25.jar:2.0.25]
          at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:521) ~[pdfbox-2.0.25.jar:2.0.25]
          at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:492) ~[pdfbox-2.0.25.jar:2.0.25]
          at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155) ~[pdfbox-2.0.25.jar:2.0.25]
          at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:282) ~[pdfbox-2.0.25.jar:2.0.25]
          at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:355) ~[pdfbox-2.0.25.jar:2.0.25]
          at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:272) ~[pdfbox-2.0.25.jar:2.0.25]
          at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:258) ~[pdfbox-2.0.25.jar:2.0.25]
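
      The message says the content-stream parser wanted the operator ID (in PDF syntax, the operator that introduces inline image data after BI and its parameters) but read "In" instead. One rough way to look for the offending bytes, assuming the page's decoded content stream is available as text, is to scan it for whitespace-delimited tokens that start with "I" but are not exactly "ID". This is a hypothetical diagnostic sketch, not PDFBox code: the names (SuspectTokenScan, findSuspectTokens) are made up, the sample string is invented, and the scan does not understand string literals or comments, so it can report false positives.

```java
import java.util.ArrayList;
import java.util.List;

public class SuspectTokenScan {

    /**
     * Hypothetical helper: returns "offset:token" for every whitespace-delimited
     * token that starts with 'I' but is not exactly "ID".
     * Assumption: the input is an already-decoded content stream held as text.
     */
    public static List<String> findSuspectTokens(String content) {
        List<String> hits = new ArrayList<>();
        int i = 0;
        int n = content.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(content.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token.
            while (i < n && !Character.isWhitespace(content.charAt(i))) {
                i++;
            }
            if (start < i) {
                String token = content.substring(start, i);
                if (token.charAt(0) == 'I' && !token.equals("ID")) {
                    hits.add(start + ":" + token);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Invented sample; the real stream's content is not known from this report.
        String sample = "q 1 0 0 1 In 20 cm BI /W 1 /H 1 ID x EI Q";
        System.out.println(findSuspectTokens(sample));
    }
}
```

      Any hit near the reported stream offset would be a candidate for the bytes the parser rejected; whether the file itself is malformed still has to be checked against the PDF itself.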

       

      Is this a known bug? Do you know when it will be fixed?

      Thanks a lot,

      Regards.


          People

            Assignee: Unassigned
            Reporter: elodie.lebouvier lebouvier
            Votes: 0
            Watchers: 3
