  1. PDFBox
  2. PDFBOX-1522

Some PDF files cause an exception (java.io.IOException: Error: Could not find font(COSName{F53.0}) in map=)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.7.1
    • Fix Version/s: 1.8.0
    • Component/s: Utilities
    • Labels: None
    • Environment: RHEL 6

    Description

      I am using PDFBox 1.7.1. When parsing some PDF files, it throws exceptions that fill the Tomcat log very quickly (100 MB in a few seconds). Another bug was filed for this issue (linked below); I tried the patch supplied there, but the problem persists. Note that the text is extracted successfully from the PDF; the parser just writes a lot of WARN messages to the log. As a workaround, I have set the log level to ERROR to suppress those WARN messages.
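
The workaround above can also be applied in code. The following is a minimal sketch, not the reporter's actual configuration: it assumes the PDFBox log output ends up in java.util.logging, and the class name QuietPdfBoxLogging is made up for illustration. PDFBox 1.x logs through Apache Commons Logging, so if the deployment uses another backend such as log4j, the same threshold would be set in that backend's configuration instead.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper class; not part of PDFBox or the reporter's application.
public class QuietPdfBoxLogging {

    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an otherwise unreferenced logger can be lost after GC.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    /** Raises the threshold so that WARNING-level records from PDFBox are dropped. */
    public static void silenceWarnings() {
        // SEVERE is the java.util.logging level closest to ERROR.
        PDFBOX_LOGGER.setLevel(Level.SEVERE);
    }

    public static void main(String[] args) {
        silenceWarnings();
        // Child loggers without their own level inherit it from "org.apache.pdfbox".
        Logger engine = Logger.getLogger("org.apache.pdfbox.util.PDFStreamEngine");
        System.out.println("WARNING enabled: " + engine.isLoggable(Level.WARNING));
        System.out.println("SEVERE enabled: " + engine.isLoggable(Level.SEVERE));
    }
}
```

This only hides the messages. It does not change how PDFBox handles the missing font, so it is a stopgap until an upgrade to a release containing the fix.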

      Here is the problematic PDF file:
      http://doratst.uark.edu/fedora/repository/default%3A1590/OBJ/Traveler20120822.pdf

      Related bug:
      https://issues.apache.org/jira/browse/PDFBOX-1359#comment-13584669

      I am getting the following exception:

      WARN 2013-02-22 14:41:19,519 (PDFStreamEngine) java.lang.NullPointerException
      java.lang.NullPointerException
      WARN 2013-02-22 14:41:19,519 (PDFStreamEngine) java.lang.NullPointerException
      java.lang.NullPointerException
      WARN 2013-02-22 14:41:19,519 (PDFStreamEngine) java.io.IOException: Error: Could not find font(COSName{F53.0}) in map={F50.1=org.apache.pdfbox.pdmodel.font.PDType1Font@50246923, F51.0=org.apache.pdfbox.pdmodel.font.PDType1Font@672a1f0}
      java.io.IOException: Error: Could not find font(COSName{F53.0}) in map={F50.1=org.apache.pdfbox.pdmodel.font.PDType1Font@50246923, F51.0=org.apache.pdfbox.pdmodel.font.PDType1Font@672a1f0}
      at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:57)
      at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
      at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:67)
      at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
      at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:67)
      at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
      at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:217)
      at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:448)
      at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:372)
      at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:328)
      at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:247)
      at dk.defxws.fedoragsearch.server.TransformerToText.getTextFromPDF(TransformerToText.java:335)
      at dk.defxws.fedoragsearch.server.TransformerToText.getText(TransformerToText.java:194)
      at dk.defxws.fedoragsearch.server.GenericOperationsImpl.getDatastreamText(GenericOperationsImpl.java:668)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.xalan.extensions.ExtensionHandlerJavaClass.callFunction(ExtensionHandlerJavaClass.java:399)
      at org.apache.xalan.extensions.ExtensionHandlerJavaClass.callFunction(ExtensionHandlerJavaClass.java:438)
      at org.apache.xalan.extensions.ExtensionsTable.extFunction(ExtensionsTable.java:220)
      at org.apache.xalan.transformer.TransformerImpl.extFunction(TransformerImpl.java:473)
      at org.apache.xpath.functions.FuncExtFunction.execute(FuncExtFunction.java:206)
      at org.apache.xpath.Expression.executeCharsToContentHandler(Expression.java:311)

          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Diwakar Timilsina (dukuonline)