Description
I'm trying to reproduce a problem reported on the user mailing list ("International characters only show correctly when form field is selected") with a confidential file that hasn't been shared. What we know from the screenshots is that it uses a Type 1 font with DictionaryEncoding. The only file I found is the one from PDFBOX-1084.
The original file was changed with this code so that all text fields have an embedded font with DictionaryEncoding that isn't used in the original form:
try (PDDocument doc = PDDocument.load(new File("PDFBOX-1084.pdf")))
{
    doc.getDocumentCatalog().setViewerPreferences(null);
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
    for (PDField field : acroForm.getFieldTree())
    {
        if (field instanceof PDTextField)
        {
            PDTextField tf = (PDTextField) field;
            String da = tf.getDefaultAppearance();
            if (da.startsWith("/HeBo"))
            {
                tf.setDefaultAppearance("/HelveticaNeue-Italic" + da.substring(5));
                field.setValue(field.getPartialName());
            }
        }
    }
    doc.save("PDFBOX-1084-mod.pdf");
}
On the modified file, this code was run:
acroForm.getField("f1_09(0)").setValue("Stanisław äöüÄÖÜß");
PDFBox shows the field content, but the ł is hard to see. Adobe Reader doesn't show the ł; when I click on the field, it shows the ł but not the German umlauts. Amusingly, if I copy and paste the garbled text that I get, it's umlauts again.
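As a possible diagnostic aid, here is a minimal sketch that lists which characters of the test value fall outside windows-1252, the code page that WinAnsiEncoding is based on. It uses only the JDK, not PDFBox. The class and method names (EncodingCheck, unencodable) are hypothetical, and it is only an assumption that this split between the ł and the umlauts is related to the behavior seen here; the font's actual DictionaryEncoding may differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

class EncodingCheck {

    // Returns the characters of text that the named charset cannot encode.
    // Hypothetical helper for illustration; not part of PDFBox.
    static List<Character> unencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        List<Character> missing = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!encoder.canEncode(c)) {
                missing.add(c);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // "Stanisław äöüÄÖÜß", written with escapes so that the source
        // file encoding does not matter.
        String value = "Stanis\u0142aw \u00e4\u00f6\u00fc\u00c4\u00d6\u00dc\u00df";
        for (char c : unencodable(value, "windows-1252")) {
            // Print the code point rather than the glyph to avoid
            // depending on the console encoding.
            System.out.println(String.format("U+%04X", (int) c));
        }
    }
}
```

For the value used above, only the ł (U+0142) is reported; the umlauts and the ß are all encodable in windows-1252.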