Details
- Improvement
- Status: Closed
- Minor
- Resolution: Incomplete
- 2.0.7
- None
- None
Description
I ran into this issue while processing a PDF file through Elasticsearch, and it turns out that the error occurs because [the method is not implemented|https://apache.googlesource.com/pdfbox/+/refs/heads/trunk/fontbox/src/main/java/org/apache/fontbox/ttf/CmapSubtable.java#327].
Below is a snippet of the stack trace I ran into.
Is there any plan to implement this method?
An error occured when reading table cmap
java.io.IOException: CMap subtype 14 not yet implemented
at org.apache.fontbox.ttf.CMAPEncodingEntry.processSubtype14(CMAPEncodingEntry.java:304)
at org.apache.fontbox.ttf.CMAPEncodingEntry.initSubtable(CMAPEncodingEntry.java:114)
at org.apache.fontbox.ttf.CMAPTable.initData(CMAPTable.java:100)
at org.apache.fontbox.ttf.TrueTypeFont.initializeTable(TrueTypeFont.java:280)
at org.apache.fontbox.ttf.AbstractTTFParser.parseTables(AbstractTTFParser.java:128)
at org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:80)
at org.apache.fontbox.ttf.AbstractTTFParser.parseTTF(AbstractTTFParser.java:109)
at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:25)
at org.apache.fontbox.ttf.AbstractTTFParser.parseTTF(AbstractTTFParser.java:84)
at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:25)
at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getTTFFont(PDTrueTypeFont.java:632)
at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getFontWidth(PDTrueTypeFont.java:673)
at org.apache.pdfbox.pdmodel.font.PDSimpleFont.getFontWidth(PDSimpleFont.java:231)
at org.apache.pdfbox.pdmodel.font.PDSimpleFont.getSpaceWidth(PDSimpleFont.java:533)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:355)
at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:557)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:458)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:383)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:342)
at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:148)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:148)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.Tika.parseToString(Tika.java:537)
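For context on what "subtype 14" refers to: in the OpenType cmap table, format 14 is the Unicode Variation Sequences subtable. The following is only a rough sketch of reading its header and variation selector records from a byte buffer, based on the layout in the OpenType specification. The class and method names (CmapFormat14Sketch, parse, VariationSelectorRecord) are hypothetical and are not part of FontBox, and the sketch does not decode the default or non-default UVS tables that the offsets point to.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT FontBox API: reads only the header and the
// variation selector records of a cmap format 14 subtable.
final class CmapFormat14Sketch {

    /** One variation selector record (11 bytes in the font file). */
    static final class VariationSelectorRecord {
        final int varSelector;           // uint24: the variation selector code point
        final long defaultUvsOffset;     // Offset32 from subtable start, 0 if absent
        final long nonDefaultUvsOffset;  // Offset32 from subtable start, 0 if absent

        VariationSelectorRecord(int varSelector, long defaultUvsOffset, long nonDefaultUvsOffset) {
            this.varSelector = varSelector;
            this.defaultUvsOffset = defaultUvsOffset;
            this.nonDefaultUvsOffset = nonDefaultUvsOffset;
        }
    }

    /** Parses a buffer positioned at the start of the subtable (the format field). */
    static List<VariationSelectorRecord> parse(ByteBuffer data) {
        // OpenType data is big-endian; work on a duplicate so the caller's position is untouched.
        ByteBuffer buf = data.duplicate().order(ByteOrder.BIG_ENDIAN);

        int format = buf.getShort() & 0xFFFF;         // uint16 format
        if (format != 14) {
            throw new IllegalArgumentException("Expected cmap format 14, got " + format);
        }
        long length = buf.getInt() & 0xFFFFFFFFL;     // uint32 byte length of the subtable
        long numRecords = buf.getInt() & 0xFFFFFFFFL; // uint32 numVarSelectorRecords
        if (length < 10 || numRecords * 11 > buf.remaining()) {
            throw new IllegalArgumentException("Truncated or malformed format 14 subtable");
        }

        List<VariationSelectorRecord> records = new ArrayList<>();
        for (long i = 0; i < numRecords; i++) {
            // uint24 is read as three unsigned bytes, most significant first.
            int selector = ((buf.get() & 0xFF) << 16)
                    | ((buf.get() & 0xFF) << 8)
                    | (buf.get() & 0xFF);
            long defaultOffset = buf.getInt() & 0xFFFFFFFFL;
            long nonDefaultOffset = buf.getInt() & 0xFFFFFFFFL;
            records.add(new VariationSelectorRecord(selector, defaultOffset, nonDefaultOffset));
        }
        return records;
    }
}
```

A parser that cannot decode this subtable could, as one possible design choice, skip it and continue with the other cmap subtables instead of throwing, since text extraction does not necessarily depend on variation sequences. Whether that is appropriate for FontBox is for the maintainers to decide.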
Issue Links
- relates to
- PDFBOX-2524 [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages
- Closed