Dear developers,
I know the font system is currently undergoing changes, so please excuse this report if it gets fixed soon anyway.
When rendering a document containing "special" characters to PDF, they are shown as # instead of the right character (fop svn, current checkout). For a full example, see the attached test file.
Example: test ∑ test
should render: test E test (where E is the sigma character).
In the PDF output I get: test # test
AWT output works fine. I don't know if this is specific to a Mac system. I have no user fonts installed. (What it should do is figure out that the standard font does not have the SIGMA character and switch to the Symbol font.)
If no one is working on it and you point me in the right direction, I may be able to provide a patch.
Max
Created attachment 18189: A small test file using the summation (sigma) character
What you describe would be better coverage of the behaviour specified by the "font-selection-strategy" property. It would really be good if we had that. You might want to check with Vincent Hennebert whether he has already done work in this area. If he hasn't, this shouldn't clash too much with Vincent's work. I haven't given much thought to how this would have to be implemented. I assume the FOText objects (o.a.fop.fo package) would have to be split after the right font for each snippet has been determined. But I'm not sure if this is enough.
I am not sure if the FO tree is the right place to fix this. I've also noticed the same behavior when an SVG graphic contains a SIGMA sign. It renders fine in the AWT output (squiggle, or AWT in fop), but not in the PDF version.
This is because the default fonts used for AWT output have more glyphs than the Base14 fonts used for PDF output. And yes, this is a font-selection-strategy issue. I haven't studied it in detail yet, but my idea is that this shouldn't be dealt with at the font system level. Rather, the font system should provide facilities like getting the set of glyphs in a given String which cannot be rendered, or perhaps getting the fonts which can display a given glyph. It's probably not worth starting to write code right now, as lots of things will certainly change with the new font system. But if you're interested you may have a look at the aXSL interface (www.axsl.org), which is the interface that the new font system will implement. You could check whether the provided methods help solve this problem, what's missing, and where and how to implement font-selection-strategy within FOP on top of that. This would certainly ease the implementation once the migration is done.
Vincent
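To make the first facility mentioned above concrete, here is a minimal sketch of "get the set of glyphs in a given String which cannot be rendered". It is not FOP or aXSL code: the class GlyphCoverage, the method missingCodePoints and the hasGlyph predicate are hypothetical names, and the predicate merely stands in for whatever coverage test the font system ends up exposing.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.IntPredicate;

/** Hypothetical helper for illustration; not part of FOP or aXSL. */
final class GlyphCoverage {
    private GlyphCoverage() { }

    /**
     * Returns the distinct code points of text that the font cannot render,
     * in order of first appearance. hasGlyph is an assumed stand-in for the
     * font's own coverage test.
     */
    static Set<Integer> missingCodePoints(CharSequence text, IntPredicate hasGlyph) {
        Set<Integer> missing = new LinkedHashSet<>();
        text.codePoints()
            .filter(cp -> !hasGlyph.test(cp))
            .forEach(missing::add);
        return missing;
    }
}
```

A caller could log the returned set as a warning, or use it to decide whether a second font has to be consulted at all.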
Dear Vincent,
some kind of "hasGlyph" function is most certainly necessary, and maybe some helper functions to go with it, but these can easily be implemented using "hasGlyph". The current font system has a "hasChar" function:
    public boolean hasChar(char c);
I've added a tracker item for axsl:
http://sourceforge.net/tracker/index.php?func=detail&aid=1478049&group_id=123259&atid=695974
---
I've also found the source of my '#'. It is in o.a.f.render.pdf.PDFRenderer#escapeText:
    if (fs.hasChar(orgChar)) {
        ch = fs.mapChar(orgChar);
        int tls = (i < l - 1 ? parentArea.getTextLetterSpaceAdjust() : 0);
        glyphAdjust -= tls;
    } else {
        if (CharUtilities.isFixedWidthSpace(orgChar)) {
            //Fixed width space are rendered as spaces so copy/paste works in a reader
            ch = fs.mapChar(CharUtilities.SPACE);
            glyphAdjust = fs.getCharWidth(ch) - fs.getCharWidth(orgChar);
        } else {
            ch = fs.mapChar(orgChar);
        }
    }
and by default fs.mapChar(orgChar) returns '#' if the char is not in that font.
----
So my "dirty hack" solution would be:
- if the glyph is not in the font, go through the list of all fonts until you find a font that has this glyph, and use it instead.
A good solution would be:
- get a list of all fonts supporting that glyph, find the one that is the "best match", and use it.
Max
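The "dirty hack" lookup described above could be sketched roughly as follows. This is an illustration only: CharCoverage, SimpleFont and FallbackFonts are hypothetical names, and the only thing taken from the thread is the hasChar(char) signature. The "best match" ranking of the good solution is deliberately left out.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for a font; it only mirrors the hasChar(char) signature quoted above. */
interface CharCoverage {
    String name();
    boolean hasChar(char c);
}

/** Toy font whose coverage is simply the characters of a string (an assumption for the example). */
final class SimpleFont implements CharCoverage {
    private final String name;
    private final String chars;

    SimpleFont(String name, String chars) {
        this.name = name;
        this.chars = chars;
    }

    @Override public String name() { return name; }
    @Override public boolean hasChar(char c) { return chars.indexOf(c) >= 0; }
}

final class FallbackFonts {
    private FallbackFonts() { }

    /** Returns the preferred font if it has the glyph, else the first fallback that does. */
    static Optional<CharCoverage> firstFontWith(char c, CharCoverage preferred,
                                                List<CharCoverage> fallbacks) {
        if (preferred.hasChar(c)) {
            return Optional.of(preferred);
        }
        for (CharCoverage font : fallbacks) {
            if (font.hasChar(c)) {
                return Optional.of(font);
            }
        }
        return Optional.empty(); // caller decides: warn, or fall back to '#'
    }
}
```

Returning an empty Optional rather than a default font keeps the "no font has this glyph" case visible to the caller, which is where a warning would naturally be emitted.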
(In reply to comment #5)
Hi Max:
> I've added a tracker item for axsl
> http://sourceforge.net/tracker/index.php?func=detail&aid=1478049&group_id=123259&atid=695974
I am responding to the aXSL request here so that I can pick up the existing thread. The aXSL methods you are looking for are:
    FontUse boolean glyphAvailable(int codePoint)
    FontUse int unavailableChar(CharSequence chars, int beginIndex)
    FontUse int[] unavailableChars(CharSequence chars, int beginIndex)
These methods are in FontUse instead of Font so that we can properly deal with Encoding issues, mostly for Type1 fonts. FontUse is the intersection of a Font, a FontConsumer, and an Encoding. FontUse instances are what get returned by the font-selection methods. For a glyph to be usable by your application, it must both 1) be available in the font, and 2) be encodable by the font's encoding.
> so my "dirty hack" solution would be:
> - if glyph is not in font, go through list of all fonts until you find a font that has this glyph, use it instead.
> a good solution would be:
> - get a list of all fonts supporting that glyph. Find the one that is the "best match". use it.
This might be permissible under font-selection-strategy="auto", but I rather think it would only be permissible as a fallback. What you probably really want is to implement font-selection-strategy="character-by-character". The font-selection methods in aXSL require one codepoint to be passed, presumably the first codepoint that needs to be encoded. Then, using the methods noted above, your application needs to determine whether the remaining text can use the same font. If not, the font-selection method needs to be consulted again, this time passing the codepoint that is not served by the first font selected. IIRC, the last time I looked at FOP code, it took the first font-family in the list and used it for all text within scope, simply using a # glyph if the desired glyph was not available.
I think Vincent is working on changing that, but I don't monitor the FOP lists, and don't know the status. When I implemented this in FOray, the hard part was not the algorithm, but finding the place to store its results. Either the FOTree or AreaTree has to know how to segment a chunk of text based on font selection. So, although the font system provides tools that are needed for the correct algorithm, it doesn't have any control over whether the correct algorithm is used. Since the XSL-FO "font-family" property is really a *list*, to ensure that you get a sigma character, you might say font-family="Base14-Helvetica, Base14-Symbol". Assuming the other font-selection criteria allow it, your sigma character would then be handled by the Base14-Symbol font.
HTH.
Victor Mote
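The selection loop described in this comment (pick a font for the first code point, keep it while it covers the following text, then consult the font list again at the first code point it does not serve) might be sketched as below. This does not use the aXSL API: FontRunSplitter, Run, firstCovering and split are hypothetical names, each font is reduced to an assumed coverage predicate, and the map's iteration order plays the role of the font-family list, so an ordered map such as LinkedHashMap is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

/** Hypothetical sketch; fonts maps a font name to an assumed coverage test. */
final class FontRunSplitter {
    private FontRunSplitter() { }

    /** A stretch of text to be set in one font; fontName is null if no listed font covers it. */
    static final class Run {
        final String fontName;
        final String text;

        Run(String fontName, String text) {
            this.fontName = fontName;
            this.text = text;
        }
    }

    /** First font in list order that covers the code point, or null if none does. */
    static String firstCovering(int codePoint, Map<String, IntPredicate> fonts) {
        for (Map.Entry<String, IntPredicate> e : fonts.entrySet()) {
            if (e.getValue().test(codePoint)) {
                return e.getKey();
            }
        }
        return null;
    }

    /** Splits text into runs, re-selecting a font only when the current one stops covering. */
    static List<Run> split(String text, Map<String, IntPredicate> fonts) {
        List<Run> runs = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int start = i;
            int first = text.codePointAt(i);
            String font = firstCovering(first, fonts);
            i += Character.charCount(first);
            while (i < text.length()) {
                int next = text.codePointAt(i);
                boolean sameRun = font != null
                        ? fonts.get(font).test(next)
                        : firstCovering(next, fonts) == null;
                if (!sameRun) {
                    break;
                }
                i += Character.charCount(next);
            }
            runs.add(new Run(font, text.substring(start, i)));
        }
        return runs;
    }
}
```

With a list of Helvetica (covering ASCII) followed by Symbol (covering only the summation sign), the text "test ∑ test" splits into three runs. Where the resulting runs are stored, in the FO tree or the area tree, is the open question raised above and is not addressed by the sketch.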
I really think it is counterintuitive to have to add "Symbol" to my list of favorite fonts. Unfortunately this is what the spec says (xsl 1.1 / 7.9.2). So the proper font-selection strategy would solve the problem presented in my original file, and I guess I'll just have to wait for that. I have two wishes for it. The first is that "Symbol" and "ZapfDingbats" become part of the "default" font lists, such as the one used when you do not specify a font and the ones used when you specify a generic font such as "serif" or "sans-serif". The second wish is a warning when an unsupported glyph is encountered. The PSRenderer does that currently. The next problem, however, is the inclusion of SVG graphics that contain glyphs. I'll attach a sample file that works fine in SVG viewers, but not within a FOP-generated PDF, due to the same font issues. Or would this be a batik / xmlgraphics issue?
Created attachment 18200: Sample SVG with a SIGMA character
Here is some additional info. When looking for the font mechanism, I noticed a lot of duplicate code between o.a.f.svg.PDFGraphics2D#drawString and o.a.f.render.pdf.PDFRenderer#drawWord. IMO a lot of this should go into a common place, ideally even into xmlgraphics: org.apache.xmlgraphics.pdf sounds like a very good place.
For the SVG graphics this suddenly seems much easier, as there already is a "drawString" method which is able to switch fonts. So here's a possible solution for my SVG problem:
- make the simpler drawString method call the attributed one (shouldn't hurt; they have lots of duplicated code anyway).
- implement automatic font switching in the attributed drawString method (simple for now, and maybe more sophisticated in the future). Have it in an external function so that it can be reused later.
For the special characters in the text I'll wait for the new font system.
Created attachment 18205: patch for PDFGraphics2D that enables rendering of special characters. This patch implements the strategy from my last comment. The font selection is very basic and just tries Symbol and ZapfDingbats (which are most likely to have the desired character). Using this patch enables my SVG formulas to be displayed correctly. For the in-fo-text characters I'll wait for the font redesign.
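For readers without the attachment, the general idea of routing a plain string through an attributed drawString can be illustrated with standard Java 2D classes. This is not the attached patch: SymbolFallback, withFallback and the baseHasGlyph predicate are hypothetical, only a single fallback family is handled, and whether FOP's PDFGraphics2D honors TextAttribute.FAMILY in this way has not been verified here.

```java
import java.awt.font.TextAttribute;
import java.text.AttributedString;
import java.util.function.IntPredicate;

/** Hypothetical helper; only the java.awt and java.text classes are real APIs. */
final class SymbolFallback {
    private SymbolFallback() { }

    /**
     * Tags every code point with the base family if the base font has a glyph
     * for it (per the assumed baseHasGlyph test), else with the fallback family.
     */
    static AttributedString withFallback(String text, String baseFamily,
                                         IntPredicate baseHasGlyph, String fallbackFamily) {
        AttributedString attributed = new AttributedString(text);
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int end = i + Character.charCount(cp);
            String family = baseHasGlyph.test(cp) ? baseFamily : fallbackFamily;
            attributed.addAttribute(TextAttribute.FAMILY, family, i, end);
            i = end;
        }
        return attributed;
    }
}
```

The result can be handed to Graphics2D.drawString(AttributedCharacterIterator, float, float) via getIterator(), so the simple drawString(String, ...) variant would only need to build the attributed string and delegate.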
resetting P2 open bugs to P3 pending further review
increase priority for bugs with a patch