Bug 41789

Summary: Text containing surrogate pairs painted as paths is wrong
Product: Batik - Now in Jira
Reporter: Cameron McCormack <cam>
Component: GVT Text
Assignee: Batik Developer's Mailing list <batik-dev>
Status: NEW ---    
Severity: normal    
Priority: P2    
Version: 1.7   
Target Milestone: ---   
Hardware: Other   
OS: other   
URL: http://arc.mcc.id.au/temp/2007/surrogate.svg

Description Cameron McCormack 2007-03-07 17:11:28 UTC
If text contains characters from outside the Basic Multilingual Plane, and the
text is rendered as paths (because it has text-rendering='geometricPrecision' or
some rotation applied), then the rendered glyphs are wrong.  In the example
at the URL given, the same text string is used twice.  The upper text element is
rendered correctly by the Java2D classes.  The lower text element is painted as
shapes since it has text-rendering='geometricPrecision', but it just repeats the
first glyph over the whole string (although the combining diereses are
rendered).

At least in AWTGVTGlyphVector.java:113, the call to glyphVector.getNumGlyphs()
returns a number that makes it look like java.awt.font.GlyphVector treats each
of the two surrogates in a pair as a separate glyph.

I suspect there are a few places in the code that don't handle surrogate pairs
properly.
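
To illustrate the kind of mismatch suspected above, here is a minimal sketch.
It is not Batik code: the class name SurrogateCount and the sample string are
made up for illustration, and it needs Java 1.5 or later.  It shows that a
string containing one supplementary character has more UTF-16 code units than
code points, so any code that treats a char index as a character or glyph index
will be off by one for every surrogate pair before it.

```java
public class SurrogateCount {
    // Hypothetical sample: 'a', U+10400 (encoded as the surrogate pair
    // D801 DC00), then 'b'.
    static final String SAMPLE = "a\uD801\uDC00b";

    // Number of UTF-16 code units, which is what char-indexed code sees.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(codeUnits(SAMPLE) + " code units, "
                + codePoints(SAMPLE) + " code points");
    }
}
```

For the sample string this reports 4 code units but 3 code points.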

Comment 1 Cameron McCormack 2007-03-07 18:25:05 UTC
According to http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
many of the JRE's classes that deal with characters were updated in Java 1.5 to
support surrogates.  Once Batik 1.7 is shipped, we may want to consider making
Java 1.5 the baseline for support.
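
As a rough idea of what a Java 1.5 baseline would allow, the sketch below walks
a string by code point using String.codePointAt and Character.charCount, both
of which were added in Java 1.5.  The class and method names (CodePointWalk,
codePoints) are hypothetical and nothing here is taken from the Batik sources.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointWalk {
    // Returns the code points of s in order, consuming a valid surrogate
    // pair as a single entry instead of two separate chars.
    static List<Integer> codePoints(String s) {
        List<Integer> out = new ArrayList<Integer>();
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);       // combines a pair starting at i
            out.add(Integer.valueOf(cp));
            i += Character.charCount(cp);    // advances 2 for supplementary, else 1
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample: 'a', U+10400 as a surrogate pair, 'b'.
        for (Integer cp : codePoints("a\uD801\uDC00b")) {
            System.out.println("U+" + Integer.toHexString(cp.intValue()).toUpperCase());
        }
    }
}
```

A loop of this shape could replace per-char iteration wherever a glyph is meant
to correspond to a whole character rather than to a single UTF-16 code unit.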

Comment 2 Cameron McCormack 2007-03-07 18:26:10 UTC
Oh, and for the example you may need the Code2001 font
(http://home.att.net/~jameskass/code2001.htm) for it to render.