  1. PDFBox
  2. PDFBOX-3931

Losing fonts (embedded subset) when merging documents with PDFMergerUtility

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.0.7
    • Fix Version/s: None
    • Component/s: PDModel, Utilities
    • Labels:
      None

      Description

      Story:
      I want to merge two PDDocument instances with:

      PDFMergerUtility#appendDocument(PDDocument destination, PDDocument source)
      

      Both documents are created from scratch in Java. For each document I open a PDPageContentStream, add some text, and then close the PDPageContentStream. For each document I use a PDFont loaded by the following code:

      // Loads Calibri as a Type 0 font into the given document.
      PDFont getFont(PDDocument document) throws IOException {
          InputStream fontStream = Thread.currentThread().getContextClassLoader().getResourceAsStream("font/Calibri.ttf");
          return PDType0Font.load(document, fontStream, true);
      }
      // Note that the subset flag (third argument) is true
      

      Then I merge documents:

      PDFMergerUtility mergerUtility = new PDFMergerUtility();
      mergerUtility.appendDocument(document1, document2);
      

      Then close document2:

      document2.close();
      

      And save document1 to OutputStream:

      document1.save(someOutputStream);
      

      Expected results:
      I get a PDF file with all fonts embedded as subsets.

      Actual result:
      The font is embedded correctly only for pages created with document1. Pages created with document2 are present, but no font is embedded for them.
      As a result, if I open the created PDF file on an OS that has Calibri.ttf installed, I see the correct font on all pages; if Calibri.ttf is not installed, the font is correct only on pages created with document1.

      Used workaround:
      I see that PDDocument has the field:

      // fonts to subset before saving
      private final Set<PDFont> fontsToSubset = new HashSet<PDFont>();
      

      Fonts are added to this field when the client calls:

      PDPageContentStream#setFont(PDFont font, float fontSize)
      

      and the actual embedding happens in the method:

      PDDocument#save(OutputStream output);
      

      In my example above, save is never called for document2.
      We append document2 to document1 and save only document1.

      I reviewed the method:

      PDFMergerUtility#appendDocument(PDDocument destination, PDDocument source)
      

      And I did not find that this method does anything with the fontsToSubset field.
      So I created the following method:

      @SuppressWarnings("unchecked")
      private static void subsetFonts(final PDDocument document) {
          try {
              Field fontsToSubsetField = document.getClass().getDeclaredField("fontsToSubset");
              fontsToSubsetField.setAccessible(true);
              Set<PDFont> fontsToSubset = (Set<PDFont>) fontsToSubsetField.get(document);
              for (PDFont font : fontsToSubset) {
                  font.subset();
              }
          } catch (NoSuchFieldException | IOException | IllegalAccessException | ClassCastException e) {
              LOGGER.warn("Error when subset embedded fonts into pdf document", e);
          }
      }
      

      And I call it before merging the documents:

      subsetFonts(document2);
      mergerUtility.appendDocument(document1, document2);
      

      (I need to use reflection because fontsToSubset is a private field of PDDocument.)
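
      The mechanism described above can be sketched with a toy model in plain Java. This is not PDFBox code: ToyFont, ToyDocument and mergeAndSave are hypothetical stand-ins, and the assumption that a merge copies each font's state as it is at merge time is a simplification of what appendDocument really does. It only illustrates why deferring the subsetting until save() can leave the appended pages without an embedded font, and why subsetting the source first avoids that.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types exist in PDFBox.
class DeferredSubsetModel {

    /** Hypothetical font whose data is embedded only once subset() has run. */
    static final class ToyFont {
        final String name;
        boolean embedded; // false until subset() is called

        ToyFont(String name) { this.name = name; }

        void subset() { embedded = true; }

        /** Snapshot of the current state (assumed merge behavior). */
        ToyFont copy() {
            ToyFont c = new ToyFont(name);
            c.embedded = embedded;
            return c;
        }
    }

    /** Hypothetical document that defers subsetting until save(). */
    static final class ToyDocument {
        final List<ToyFont> pageFonts = new ArrayList<>(); // one font per page
        final Set<ToyFont> fontsToSubset = new HashSet<>();

        void addPage(ToyFont font) {
            pageFonts.add(font);
            fontsToSubset.add(font); // recorded now, embedded later
        }

        void subsetPendingFonts() {
            for (ToyFont f : fontsToSubset) {
                f.subset();
            }
            fontsToSubset.clear();
        }

        void save() { subsetPendingFonts(); }

        /** Copies the source pages; the source's pending set is ignored. */
        void append(ToyDocument source) {
            for (ToyFont f : source.pageFonts) {
                pageFonts.add(f.copy());
            }
        }

        boolean allFontsEmbedded() {
            for (ToyFont f : pageFonts) {
                if (!f.embedded) return false;
            }
            return true;
        }
    }

    /** Merges two one-page documents and reports whether every page font is embedded. */
    static boolean mergeAndSave(boolean subsetSourceFirst) {
        ToyDocument document1 = new ToyDocument();
        document1.addPage(new ToyFont("Calibri"));
        ToyDocument document2 = new ToyDocument();
        document2.addPage(new ToyFont("Calibri"));

        if (subsetSourceFirst) {
            document2.subsetPendingFonts(); // the workaround from this report
        }
        document1.append(document2);
        document1.save(); // only document1 is ever saved
        return document1.allFontsEmbedded();
    }

    public static void main(String[] args) {
        System.out.println("merge without subsetting source: " + mergeAndSave(false));
        System.out.println("merge after subsetting source:   " + mergeAndSave(true));
    }
}
```

      In this model the first call reports false and the second reports true, which matches the behavior I observe with the real documents. It says nothing about whether other approaches would work in PDFBox itself.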

      I think another, maybe better, option could be:

      document1.fontsToSubset.addAll(document2.fontsToSubset);
      

      But I have not tested this option.

      Conclusion:
      I think this problem should be solved on the library side, in the PDFMergerUtility#appendDocument method, and not in client code. Otherwise, the Javadoc should state that PDFMergerUtility#appendDocument should be used only with PDDocument instances that have already been saved.

              People

              • Assignee:
                Unassigned
                Reporter:
                njdub Nazar Dub