PDFBOX-954

Embedded font: value for /Widths faulty (worked in PDFBox 1.3.0!)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.7.1
    • Component/s: FontBox
    • Labels:
      None
    • Environment:
      JDK1.6.0_23, Windows XP

      Description

      We have a problem with the font 'LucidaSansUnicode' (l_10646.ttf). It is embedded in a PDF, and when viewing this PDF (with Acrobat Reader 9), the error

      In der Schrift "LucidaSansUnicode" ist der Wert für /Widths fehlerhaft.

      occurs (roughly translated: "In font "LucidaSansUnicode" the value for /Widths is faulty."). I noticed that this error only occurs when the first page that has text added by PDFBox is displayed! The same font is also used for all other text (which Apache FOP generates). When I look at the third tab ("Fonts") of the Acrobat dialog window, I notice lots of entries

      LucidaSansUnicode (Eingebettete Untergruppe)
      Typ: TrueType (CID)
      Kodierung: Identity-H

      but only 1 entry at the very top that looks different:

      LucidaSansUnicode (Eingebettet)
      Typ: TrueType
      Kodierung: Ansi

      I guess one is from Apache FOP (generation of PDF) and one is from PDFBox (adding additional text to the PDF). However, both use the same source file "l_10646.ttf"!

      Using PDFBox 1.3.0-snapshot (or iText 2.1.7), this problem does NOT occur!

      This only occurs with this "LucidaSansUnicode" font - all our other custom fonts don't cause this problem.

      The difference I notice in Acrobat Reader Fonts tab is the first font entry:

      PDFBox 1.4.0:

      LucidaSansUnicode (Eingebettet)
      Typ: TrueType
      Kodierung: Ansi

      PDFBox 1.3.0 or iText 2.1.7:

      LucidaSansUnicode (Eingebettete Untergruppe)
      Typ: TrueType
      Kodierung: Ansi

      So, PDFBox 1.4.0 only shows "embedded" ("Eingebettet"), but the PDFBox 1.3.0/iText version shows "embedded subset" ("Eingebettete Untergruppe")! Perhaps this is the problem?

      1. hello_ttf_1.1.0.pdf
        416 kB
        Bob Swanson
      2. hello_ttf_1.4.0.pdf
        428 kB
        Bob Swanson
      3. Imagen 3.png
        42 kB
        David Villace
      4. Imagen 2.png
        53 kB
        David Villace
      5. Imagen 1.png
        16 kB
        David Villace
      6. out.pdf
        438 kB
        David Villace
      7. Main.java
        2 kB
        David Villace
      8. MainVer2.java
        4 kB
        David Villace
      9. outVer2.pdf
        416 kB
        David Villace
      10. MainVer2.java
        4 kB
        David Villace
      11. pdfbox-1.7.0-ttf-widths-encoding-fix.patch
        9 kB
        Wolfgang Glas

          Activity

          Bob Swanson added a comment -

          Tried with MAC OSX and was able to remove the work-around. Seems to be working fine! Many Thanks. Danke.

          Hesham added a comment -

          I have tested this on Windows & Mac OS X, and it works fine.
          Thanks Wolfgang ... Thanks Andreas

          Andreas Lehmkühler added a comment -

          I ran some additional tests and everything works fine for me, set to resolved.

          Thanks for the contribution!

          Andreas Lehmkühler added a comment -

          I experienced some problems with the first patch, but the improved one now works fine. Added in revision 1356866.

          Wolfgang Glas added a comment -

          Improved version of my patch, which now precisely follows the specs in PDFReference16.pdf, p.401

          Wolfgang Glas added a comment -

          The attached patch fixes the problem for fonts with a valid 'post' table. Further work might be needed in order to use a (3,1)-cmap as described in PDFReference1.6.pdf, p. 401

          Hesham added a comment -

          I have tested this with PDFBox v1.7 using PDTrueTypeFont.loadTTF(...) and this issue still occurs for any TTF file I use.

          David Villace added a comment - - edited

          Hello.

          I think I found a temporary workaround.

          I finally looked up the meaning of the /Widths array in the PDF specification. As I said in my first post, the problem seems to be that the array contains the widths of all characters defined in the selected font. In my opinion, the array should instead contain the widths of all characters between the first and the last character actually used at the moment you write, for example, a string.

          I attach a variant of the program from my previous post. The result is a PDF file with 3 strings, each with its own font and font descriptor, but all three share the value of the "FontFile2" attribute. The PDF opens without the error message, and the Arial font is embedded only once. Previous tests produced a PDF file with the font embedded three times... a really big file.

          The files I attach are:

          • MainVer2.java (2nd version; the first one is a wrong version of the test)
          • outVer2.pdf

          I see that the previously attached file (Main.java) did not handle the character "ñ", so I removed it, along with the Spanish pseudo-character "ll".

          Oh! This workaround works fine on PDFBox 1.6.0 and 1.7.0 (trunk).

          Good afternoon
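
The idea described in this comment (a /Widths array that covers only the code range actually used by the text) can be sketched roughly as follows. This is not the attached MainVer2.java, whose contents are not shown here. The class `UsedRangeWidths` and the `widthOfCode` lookup are hypothetical names introduced for illustration; in real code the widths would come from the font's metrics, and the sketch assumes a single-byte encoding such as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.IntUnaryOperator;

/** Hypothetical holder for the /FirstChar, /LastChar and /Widths values of a simple font. */
final class UsedRangeWidths {
    final int firstChar;
    final int lastChar;
    final int[] widths;

    private UsedRangeWidths(int firstChar, int lastChar, int[] widths) {
        this.firstChar = firstChar;
        this.lastChar = lastChar;
        this.widths = widths;
    }

    /**
     * Builds the widths for the contiguous code range spanned by the given
     * single-byte character codes. widthOfCode is an assumed lookup that
     * returns the glyph width for a code.
     */
    static UsedRangeWidths forText(byte[] codes, IntUnaryOperator widthOfCode) {
        if (codes.length == 0) {
            throw new IllegalArgumentException("text must not be empty");
        }
        int first = 255;
        int last = 0;
        for (byte b : codes) {
            int code = b & 0xFF; // treat the byte as an unsigned code 0..255
            first = Math.min(first, code);
            last = Math.max(last, code);
        }
        // One entry per code from first to last, including codes the text does not use.
        int[] widths = new int[last - first + 1];
        for (int i = 0; i < widths.length; i++) {
            widths[i] = widthOfCode.applyAsInt(first + i);
        }
        return new UsedRangeWidths(first, last, widths);
    }

    public static void main(String[] args) {
        byte[] codes = "Hello".getBytes(StandardCharsets.ISO_8859_1);
        // Dummy width function, for demonstration only.
        UsedRangeWidths r = forText(codes, code -> 500 + code);
        System.out.println(r.firstChar + " " + r.lastChar + " " + r.widths.length); // 72 111 40
    }
}
```

Because the range is contiguous, the array still holds entries for unused codes between the first and last character; it simply never grows beyond the 256 codes a single-byte encoding can address.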

          David Villace added a comment - - edited

          Hello!

          I attach some files about the "test":

          • Imagen 1.png: The error message in Spanish (I'm from Catalonia)
          • Imagen 2.png: The Adobe Reader version
          • Imagen 3.png: About my Mac
          • out.pdf: The generated PDF file
          • Main.java: The program

          The PDFBox version is 1.7.0 (trunk).

          Good night from Barcelona

          Andreas Lehmkühler added a comment -

          @David
          Can you be a little bit more specific, please? What font did you use? Please attach the resulting pdf to this issue.

          David Villace added a comment -

          Hello.

          I tried to create a simple PDF with the trunk code (1.7.0) and the problem remains. The PDF has nothing special in it. It's the typical example from the PDFBox project website.

          When I open the document with Adobe Reader the message about wrong widths appears.

          Good night everybody.

          David Villace added a comment -

          Hello, my name is David.
          I tried applying the changes suggested for the classes "OS2WindowsMetricsTable" and "PDTrueTypeFont", and the problem continues. The widths don't match.
          I'm new to the PDF world, but... the array of widths must indeed contain the widths between the first character and the last character.
          I suspect the problem is that the array must contain the widths of the characters (glyphs) used in the represented text, not of all characters (glyphs) specified between the first and the last character in the TrueType font file (".ttf").

          Andreas Lehmkühler added a comment -

          The fix is available in the current trunk only.

          MH added a comment -

          Still happens with PDFBox 1.6.0.

          Andreas Lehmkühler added a comment -

          I (hopefully) fixed the calculation of the /Widths value in revision 1167012.

          Please run some additional tests as I only tested a couple of ttf fonts with different configurations.

          The next thing I'd like to fix is the used encoding for those ttf fonts, see PDFBOX-922 for further details.

          Andreas Lehmkühler added a comment -

          I'm working on this issue.

          As a first step, I fixed the calculation of the family class in revision 1166824.

          Hesham added a comment -

          Will this issue be fixed in the next version? I see this as a critical issue!

          Stephen Blackwell added a comment -

          I've been getting the same problem with all TTF fonts on all versions >= 1.2.0, including the 1.6.0 snapshots.

          The change between version 1.1.0 and 1.2.0 that started the issue is in the following line in loadDescriptorDictionary() in org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.java:

          // version 1.1.0
          int maxWidths = 256;

          // version 1.2.0 and later
          int maxWidths = glyphToCCode.length;

          I've tried setting maxWidths to different values manually, but 256 seems to be the maximum allowable. Setting it to 257 will reproduce the error.

          Is Adobe Reader really limited to 256 character widths on embedded fonts?
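
A possible explanation, offered as a sketch and not as a confirmed diagnosis of Adobe Reader's behaviour: in the PDF Reference, a simple font (which includes TrueType fonts) addresses glyphs with single-byte character codes, and its /Widths array is defined to have exactly LastChar - FirstChar + 1 entries. With codes limited to 0..255, more than 256 widths can never be consistent. The class and method names below are hypothetical and do not come from PDFBox.

```java
/** Hypothetical consistency check for a simple font's /FirstChar, /LastChar and /Widths. */
final class SimpleFontWidthsCheck {
    /** Single-byte character codes give at most 256 distinct codes (0..255). */
    static final int MAX_CODES = 256;

    /** Returns true if the code range is valid and the array has one width per code. */
    static boolean isConsistent(int firstChar, int lastChar, int[] widths) {
        if (firstChar < 0 || lastChar > MAX_CODES - 1 || firstChar > lastChar) {
            return false;
        }
        return widths.length == lastChar - firstChar + 1;
    }

    public static void main(String[] args) {
        System.out.println(isConsistent(0, 255, new int[256])); // true
        System.out.println(isConsistent(0, 255, new int[257])); // false
    }
}
```

Under this reading, an array sized by the number of glyphs in the font file (as with `glyphToCCode.length`) would only pass the check when the font happens to map 256 codes or fewer.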

          Bob Swanson added a comment -

          "hello 1.1.0" reads just fine with Adobe Reader, the "hello 1.4.0" gets the warning error. Font is Arial.TTF from Mac. Adobe Reader warning occurs on Mac or Windows 7.

          Bob Swanson added a comment -

          I have also pointed out this issue. I can reproduce the problem with versions of PDFBox after 1.1.0. I am using Arial.TTF from the Mac. I have tried to view the files on my wife's Windows 7 machine and get the same warning errors in Adobe Reader, so at least the Adobe Reader warning does not depend on being on a Mac. Other postings indicate that the TTF files from Windows cause the same message to appear.

          I have two PDF files I created using the canned version of "org.apache.pdfbox.examples.pdmodel.HelloWorldTTF" that comes with the distribution. This process eliminated any possibility that my own PDF-creation software was causing the problem. I ran the same command line, using the 1.1.0 and 1.4.0 JAR files (and some versions in between).

          The file from 1.1.0 reads just fine, but those created by any JAR after 1.1.0 all cause the same warning error. (I have not been able to download 1.5.0 yet, but others' experiences seem to point to the same issue).


            People

            • Assignee:
              Andreas Lehmkühler
              Reporter:
              MH
            • Votes:
              4
              Watchers:
              5
