Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
2.0.18, 2.0.19
None
Description
Issue:
I tried to render PDFs that contain embedded Chinese fonts. Neither the PDF Debugger, nor printouts of the document (PDFPrintable), nor the PDFRenderer displays the Chinese glyphs correctly; all of them render placeholders instead.
Assumptions:
I assume (though I am not certain) that the embedded fonts are incomplete and do not contain all glyphs required to render the text properly, so PDFBox attempts to use the previously determined fallback font and then fails to find the glyphs in that font as well.
That is not surprising, as the fallback font "MalgunGothic-Semilight" (a Windows standard font) does not contain Chinese characters.
Debugging:
I tried to understand how the fallback font is determined and what I could do to solve this problem on my end, but I was unable to find a satisfactory solution.
My best guess so far is that the CIDFontMapping lookup (FontMapperImpl) is to blame for choosing an unfit fallback font.
Although it seems to check whether the required code pages are contained in a fallback font, it still ranks the Malgun font as the top scorer and best substitute, even though that font clearly does not contain all required code pages.
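To illustrate the behaviour I would have expected, here is a minimal sketch in plain Java. It is not PDFBox's actual code: the `Candidate` class, the `pickFallback` method, the scores and the code page bitmasks are all hypothetical and chosen only for illustration. The bit positions follow the `ulCodePageRange1` field of the OpenType OS/2 table. The idea is that a font lacking the required code page is excluded before any ranking, so a high score cannot make it win.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class FallbackFilterSketch {
    // Bit positions in the OS/2 table's ulCodePageRange1 field (OpenType spec).
    static final int CP_JAPANESE = 17;            // code page 932
    static final int CP_CHINESE_SIMPLIFIED = 18;  // code page 936
    static final int CP_KOREAN_WANSUNG = 19;      // code page 949
    static final int CP_CHINESE_TRADITIONAL = 20; // code page 950

    /** Hypothetical candidate font: name, code page bits, and a ranking score. */
    static final class Candidate {
        final String name;
        final long codePageRange;
        final double score;

        Candidate(String name, long codePageRange, double score) {
            this.name = name;
            this.codePageRange = codePageRange;
            this.score = score;
        }

        boolean supports(int codePageBit) {
            return (codePageRange & (1L << codePageBit)) != 0;
        }
    }

    /** Treats the code page as a hard requirement, then picks the best score. */
    static Optional<Candidate> pickFallback(List<Candidate> candidates, int requiredBit) {
        return candidates.stream()
                .filter((Candidate c) -> c.supports(requiredBit))
                .max(Comparator.comparingDouble((Candidate c) -> c.score));
    }

    public static void main(String[] args) {
        // Made-up bitmasks and scores; real fonts may declare different ranges.
        List<Candidate> candidates = Arrays.asList(
                new Candidate("MalgunGothic-Semilight", 1L << CP_KOREAN_WANSUNG, 9.0),
                new Candidate("AdobeSongStd-Light", 1L << CP_CHINESE_SIMPLIFIED, 5.0));

        Optional<Candidate> pick = pickFallback(candidates, CP_CHINESE_SIMPLIFIED);
        System.out.println(pick.map(c -> c.name).orElse("no suitable fallback"));
    }
}
```

With these made-up inputs, the Korean font is filtered out despite its higher score, and the lookup returns nothing at all when no candidate covers the requested code page.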
My opinion:
This is troubling, as better-suited fonts exist and could have been selected (e.g. Adobe Song Std). They are indeed included in the CIDFontMapping candidates, but seem to score lower for some reason.
Further information:
I cannot disclose the document in question; however, I found a document (pdf_font-zhcn.pdf) in another issue (PDFBOX-3132) that can be used to reproduce the problem (e.g. by dropping it into the PDF Debugger).