  1. PDFBox
  2. PDFBOX-4152

Glyphs don't appear properly in form with embedded type1 font with DictionaryEncoding

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.8
    • Fix Version/s: None
    • Component/s: AcroForm

    Description

      I'm trying to reproduce a problem mentioned on the user mailing list ("International characters only show correctly when form field is selected"). The file in question is confidential and hasn't been shared. All we know from the screenshots is that it uses a Type 1 font with DictionaryEncoding. The only suitable file I found is the one from PDFBOX-1084.

      The original file was changed with this code so that all text fields have an embedded font with DictionaryEncoding that isn't used in the original form:

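              import java.io.File;
              import org.apache.pdfbox.pdmodel.PDDocument;
              import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
              import org.apache.pdfbox.pdmodel.interactive.form.PDField;
              import org.apache.pdfbox.pdmodel.interactive.form.PDTextField;

              try (PDDocument doc = PDDocument.load(new File("PDFBOX-1084.pdf")))
              {
                  // Drop the viewer preferences stored in the original file
                  doc.getDocumentCatalog().setViewerPreferences(null);
                  PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
                  for (PDField field : acroForm.getFieldTree())
                  {
                      if (field instanceof PDTextField)
                      {
                          PDTextField tf = (PDTextField) field;
                          String da = tf.getDefaultAppearance();
                          // Only touch fields whose default appearance starts with the /HeBo font
                          if (da != null && da.startsWith("/HeBo"))
                          {
                              // Swap in the embedded font with DictionaryEncoding; keep size and color
                              tf.setDefaultAppearance("/HelveticaNeue-Italic" + da.substring(5));
                              // Fill the field with its own name so there is text to render
                              tf.setValue(tf.getPartialName());
                          }
                      }
                  }
                  doc.save("PDFBOX-1084-mod.pdf");
              }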
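
      The font swap above relies on "/HeBo" being exactly five characters long. A minimal standalone sketch of the same string rewrite, using only the JDK, might look like the following. The class name, the helper swapFont, and the sample default appearance string "/HeBo 9 Tf 0 g" are hypothetical and not taken from the PDFBOX-1084 file.

```java
public final class DefaultAppearanceRewrite {

    /**
     * Replaces the leading font resource name of a default appearance (DA)
     * string and leaves the rest (size, operators) untouched.
     * Returns the input unchanged if it does not start with "/" + oldName.
     */
    static String swapFont(String da, String oldName, String newName) {
        if (da == null) {
            return null;
        }
        String prefix = "/" + oldName;
        // Require the name to end at whitespace (or end of string) so that
        // "/HeBo" does not accidentally match a longer name such as "/HeBoX".
        boolean matches = da.startsWith(prefix)
                && (da.length() == prefix.length()
                    || Character.isWhitespace(da.charAt(prefix.length())));
        return matches ? "/" + newName + da.substring(prefix.length()) : da;
    }

    public static void main(String[] args) {
        // Hypothetical DA string, used for illustration only.
        System.out.println(swapFont("/HeBo 9 Tf 0 g", "HeBo", "HelveticaNeue-Italic"));
        // prints "/HelveticaNeue-Italic 9 Tf 0 g"
    }
}
```

      Unlike the startsWith("/HeBo") check above, this sketch refuses to rewrite a longer font name that merely begins with the same letters. That is a design choice of the sketch, not something the original code does.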
      

      On the modified file, this code was run:

      acroForm.getField("f1_09(0)").setValue("Stanisław äöüÄÖÜß");
      

      PDFBox shows the field content, but the ł is hard to see. Adobe Reader doesn't show the ł; after clicking on the field, it shows the ł but not the German umlauts. Amusingly, if I copy and paste the garbled text that I get, it comes out as umlauts again.
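
      The ł and the umlauts behave differently in both viewers. As a rough illustration only, the sketch below splits the test string by whether each character fits into ISO-8859-1. Latin-1 is merely a stand-in here: the actual DictionaryEncoding of the font is not known from this report, so this does not diagnose the bug. The class and method names are hypothetical.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public final class Latin1Split {

    /** Returns the code points (as "U+XXXX") of all chars that ISO-8859-1 cannot encode. */
    static List<String> outsideLatin1(String text) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        List<String> result = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!latin1.canEncode(c)) {
                result.add(String.format("U+%04X", (int) c));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The test value from this issue, written with Unicode escapes so the
        // source file compiles the same way regardless of its encoding.
        String value = "Stanis\u0142aw \u00e4\u00f6\u00fc\u00c4\u00d6\u00dc\u00df";
        System.out.println(outsideLatin1(value));
        // prints "[U+0142]"
    }
}
```

      Only ł (U+0142) falls outside Latin-1, while the umlauts and ß are inside it. Whether that split has anything to do with the rendering differences described above is an assumption that would need checking against the font's actual encoding.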

      Attachments

        1. PDFBOX-1084-mod-after.pdf
          240 kB
          Tilman Hausherr
        2. PDFBOX-1084-mod-before.pdf
          240 kB
          Tilman Hausherr

          People

            Assignee: Unassigned
            Reporter: Tilman Hausherr
            Votes: 0
            Watchers: 1
