Issue 60594 - Kashidas painted into space between two Arabic words
Summary: Kashidas painted into space between two Arabic words
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: formatting (show other issues)
Version: OOo 2.0
Hardware: All All
: P3 Trivial (vote)
Target Milestone: ---
Assignee: stefan.baltzer
QA Contact: issues@sw
URL:
Keywords:
: 87476 94896 98194 (view as issue list)
Depends on:
Blocks: 79434 71804
  Show dependency tree
 
Reported: 2006-01-15 21:05 UTC by ahmed_aldesouky
Modified: 2013-08-07 14:43 UTC (History)
5 users (show)

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Showing Arabic text problem when text is justified. (108.03 KB, image/png)
2007-05-16 22:38 UTC, okhayat
no flags Details
Spacing in Arabic Justified text (90.39 KB, image/png)
2007-10-08 07:29 UTC, okhayat
no flags Details
patch for file vcl\win\source\gdi\winlayout.cxx (4.87 KB, text/plain)
2008-04-01 11:53 UTC, hennerdrewes
no flags Details
patch draft for porlay.cxx (15.75 KB, text/plain)
2008-04-09 10:16 UTC, hennerdrewes
no flags Details
patches for files in vcl/inc (1.33 KB, text/plain)
2008-04-11 17:34 UTC, hennerdrewes
no flags Details
patches for files in vcl/source (1.76 KB, text/plain)
2008-04-11 17:34 UTC, hennerdrewes
no flags Details
patches for files in vcl/win (17.08 KB, text/plain)
2008-04-11 17:36 UTC, hennerdrewes
no flags Details
patch for file vcl\win\source\gdi\winlayout.cxx (on version 1.108.206.2) (5.71 KB, text/plain)
2008-04-11 17:38 UTC, hennerdrewes
no flags Details
patch for sw/core/source/inc/scriptinfo.hxx (1.79 KB, text/plain)
2008-04-11 17:46 UTC, hennerdrewes
no flags Details
patch for sw/source/core/text/porlay.cxx (18.19 KB, text/plain)
2008-04-11 17:47 UTC, hennerdrewes
no flags Details
new patch proposal (28.23 KB, text/plain)
2008-04-26 14:20 UTC, hennerdrewes
no flags Details
Screenshot of LTR paragraph: (128.32 KB, image/png)
2008-04-26 14:43 UTC, hennerdrewes
no flags Details
Screenshot of RTL paragraph: OK (134.74 KB, image/png)
2008-04-26 14:44 UTC, hennerdrewes
no flags Details
new patch for writer (33.92 KB, patch)
2008-05-31 13:54 UTC, hennerdrewes
no flags Details | Diff
patch for vcl reworked (13.20 KB, patch)
2008-06-13 10:46 UTC, hennerdrewes
no flags Details | Diff
patch correction (1.08 KB, patch)
2008-06-19 10:02 UTC, hennerdrewes
no flags Details | Diff
cleaned up patch for vcl (9.74 KB, patch)
2008-07-04 10:44 UTC, hennerdrewes
no flags Details | Diff
vcl.patch (task #3) (13.51 KB, patch)
2008-09-23 15:18 UTC, hennerdrewes
no flags Details | Diff
sw.patch (task #3) (30.37 KB, patch)
2008-09-23 15:19 UTC, hennerdrewes
no flags Details | Diff
new bugdoc (2.29 KB, application/vnd.oasis.opendocument.text)
2008-10-01 12:41 UTC, frank.meies
no flags Details
2nd try (10.04 KB, application/vnd.oasis.opendocument.text)
2008-10-01 13:23 UTC, frank.meies
no flags Details
screenshot (4.41 KB, image/png)
2008-10-01 13:24 UTC, frank.meies
no flags Details
Screenshot (2.06 KB, image/png)
2008-10-09 15:21 UTC, hennerdrewes
no flags Details
patch fntcache.cxx (858 bytes, patch)
2008-10-13 10:27 UTC, hennerdrewes
no flags Details | Diff
screenshot showing the problem for Western text with word underline (2.78 KB, image/png)
2008-10-13 10:31 UTC, hennerdrewes
no flags Details

Note You need to log in before you can comment on or make changes to this issue.
Description ahmed_aldesouky 2006-01-15 21:05:44 UTC
When use the "Writer" programe to write an Arabic paragraph, it works well. 
But when make this paragraph justified, some words split with noticed spacing.
Thanks
Comment 1 ahmed_aldesouky 2006-01-15 21:15:12 UTC
When useing "Writer" to write a Justified Arabic paragraph some words splits 
with a spaceing.
Thanks.
Comment 2 stefan.baltzer 2007-03-30 18:39:48 UTC
SBA: Please re-verify your findings with OOo 2.2 (out NOW...)
and comment here. If the problem still persists, please attach a bugdoc showing
this behavior.
Thank you.
Comment 3 petr.dudacek 2007-05-16 13:11:47 UTC
Hi ahmed_aldesouky,

any news on this issue?
Comment 4 okhayat 2007-05-16 22:36:10 UTC
This problem still exists. I'm testing on Debian SID (OOo 2.2). There are also
additional unnecessary Kashidas added. Check the screenshot.
Comment 5 okhayat 2007-05-16 22:38:16 UTC
Created attachment 45156 [details]
Showing Arabic text problem when text is justified.
Comment 6 frank.meies 2007-10-02 14:28:09 UTC
Just to clarify things: This issue is about the kashidas painted between two
Arabic words in justified alignment. There might be two reasons for this bug:

1. Normal Western justified alignment is applied to the text because the
language attribute of the CTL text is not set to an Arabic language. This is a
very common problem, since many users do not set the language attribute
correctly. We have issue 28203 which should make the justification mode (Westers
vc. Kashida)  independent from the language attribute by evaluating the unicode
code points.

2. A problem in the underlying layout engine. 

This issue is for 2. See issue 28203 for case 1.
Comment 7 okhayat 2007-10-08 07:28:59 UTC
The problem still exists with 2.3 though the rendering is a bit better now.
Comment 8 okhayat 2007-10-08 07:29:36 UTC
Created attachment 48739 [details]
Spacing in Arabic Justified text
Comment 9 frank.meies 2007-10-10 08:11:18 UTC
Adjusted title, see attached Screenshot.png. There is an other issue for gaps
inside Arabic words: issue 34139.
Comment 10 frank.meies 2007-11-05 08:58:39 UTC
.
Comment 11 hennerdrewes 2008-02-03 17:09:43 UTC
Please see my comments and patches for issue 77976.

For Windows: When you request to draw a justified Arabic word, but a glyph is
stretched by a value smaller than the minimal kashida width allowed by the font,
you get the gaps within the words.

I suggested to correct that behavior in winlayout.cxx, but hdu mentioned
problems involved with that. So I haven't come up with a patch for that.
Comment 12 hennerdrewes 2008-04-01 11:53:31 UTC
Created attachment 52432 [details]
patch for file vcl\win\source\gdi\winlayout.cxx
Comment 13 hennerdrewes 2008-04-01 12:09:49 UTC
I posted a patch for winlayout.cxx, which adjusts character positions in Arabic
text according to the iMinKashida value returned by ScriptGetFontProperties. The
text appears now without the gaps inside the words. 

Sideeffects: 
1. When applying color to single letters within a word, gaps may still appear. 
2. The impact on cursor placement is minimal, as characters are moved by very
few pixels
3. PDF export is entirely broken.
4. Of course, this is a Windows only solution.

I understand that it would be better to adjust the character placement right
away in writer, maybe in KashidaJustify. But I have to get the minimum kashida
width value over there from vcl. But I really didn't know how to do that (surely
not without messing up a lot of things).

I'd be happy to help out with a more permanent solution.

See also my patch in issue 87688 for the placing of kashidas.



Comment 14 frank.meies 2008-04-02 10:08:46 UTC
fme->hdu: Since this patch affects your playground, I suggest you take over.
Please have a look.
Comment 15 hdu@apache.org 2008-04-02 10:29:03 UTC
@hennerdrewes: thanks for the patch! Since the related VCL issue from issue 77976 is currently being 
integrated (probably into DEV300_m7, maybe even into m6) I'd like you to check your patch with this 
newer milestone. The file winlayout.cxx will probably look like in revision 1.108.206.2 then.
Comment 16 hennerdrewes 2008-04-02 11:06:09 UTC
@hdu: I can do that, but that'll take some time.

A more general question: 

- Do you want to integrate the patch in this or a similar form?
- Or should this check move to writer where KashidaJustify is applied in the
first place? The vcl should somewhere provide a MinKashidaWidth then.

Related to my remarks from yesterday: PDF export with kashida justification
hasn't worked before. This probably should go to a separate issue.
Comment 17 hdu@apache.org 2008-04-02 12:53:28 UTC
I'd prefer the second alternative to make the value for MinKashidaWidth available to the application layer 
that does the kashida justification and act on it in that layer.
Comment 18 hennerdrewes 2008-04-08 08:37:48 UTC
@hdu: some additional thoughts

I made some more experiments. I spare you the patches until they will be more
useful or you explicitly request to get them. But I think I have some important
hints to find a reasonable approach.

- It seems that - no matter if writer will respect a minimal kashida width in
its justification or not - the approach taken in my previous patch remains
valid. Probably due to rounding errors from logical units to device units and
due to a tendency of ScriptApplyLocigalWidths to be just off by one pixel in its
results, I still observed the gaps breaking the words. Less frequently though.

- There is another problem with vowel placement on a stretched letter. This also
seems to require some extra manipulation of the justified glyph widths. (not
implemented in the previous patch yet)

- I saw, that there is a GenericSalLayout::KashidaJustify method, which also
respects a minimal kashida width. Does this work? Or does this require
debugging? This issue was reported on all platforms! Anyhow, the minimal kashida
width used over there seems to be much smaller than the one returned by usp.

- When I moved the code to correct the character widths to writer, I discovered
that it's pretty useless to pad the blanks as I did in winlayout.cxx. As said
before, the rounding errors force me to do it again over there anyhow, and so
there is not much to gain. After looking at ScriptJustify once more, I switched
to reduce the number of kashida insertion points in the line, and recalculated
the space to add for each kashida. With this, the justification looks really
clean now. The fixing in winlayout.cxx has not a lot to do, except for some rare
cases (e.g. very short lines). But I encountered one problem: The cursor
placement is affected, and I have a "chicken and the egg" problem there. Cursor
placement relies on the correct number and identification of kashida insertion
points. But I manipulated there, so the cursor placement is off in some cases.
So there is some work to be done there, but I left that for the moment... 

Comment 19 hdu@apache.org 2008-04-08 09:53:20 UTC
Thanks for looking into this!

The kashida width currently used for GenericSalLayout::KashidaJustify() is just the logical width of an 
isolated glyph for the font's U+0640 (arabic tatweel). The idea is that it doesn't make sense to insert 
justification glyphs if they are too wide. Do you happen to know how the member iKashidaWidth from usp's FontProperties and this glyph width correlate?

I'm sure that GenericSalLayout::KashidaJustify() could benefit from more debugging. I'd delay this task 
though until we are happy with the justification result on one platform (USP?) and then port it to the 
other platforms. GenericSalLayout is mainly used on our Unix ports.

Writer does its cursor placement usually depending on the result of OutputDevice::GetTextArray() which 
is very related to UniscribeLayout::FillDXArray(). But when you override the resulting DXArray with your 
improved justification method then the cursor positioning needs to be adjusted too. FME can probably 
point you to the matching code in Writer.

The problem with vowel placement on a stretched letter is probably caused because a glyph cluster is 
not properly detected. Only glyph clusters themselves should get adjusted, the positioning inside a 
glyph cluster is supposed to be unchanged.

Hope this helps.
Comment 20 hennerdrewes 2008-04-08 12:58:04 UTC
The iMinKashida is in most of the fonts I checked 3/5th of the width of the
tatweel glyph (Arial, Times New Roman, Arabic Typesetting). In Tahoma the two
values are equal.

Wouldn't it make also sense to add the approach from
GenericSalLayout::KashidaJustify() to UniscribeLayout:GetNextGlyphs? Currently,
if I do a PDF export with Arabic justified text, we get huge gaps instead of the
kashidas. 
Comment 21 hennerdrewes 2008-04-09 10:13:38 UTC
@fme: I'll attach another version of the porlay.cxx patch. As discussed before
with hdu, I am searching for a way to modify writer's decision on kashida
positions that respects a minimal allowed kashida width. More details in the
previous comments. 

Can you look at the changes of the KashidaJustify method?
The patch works great for text display, but it disrupts cursor movement. When
called without the nSpaceAdd or nMinKashidaWidth parameters and no pKernArray is
provided, it returns the number of all possible kashida postions as before. But
if they are present, kashida positions are dropped and nSpaceAdd is adjusted
accordingly. 

I could catch, when running the process of cursor placement through the
debugger, that KashidaJustify is called with a nLen value just up to the present
cursor position. But in my version I needed to look a the whole text portion and
available space to determine, if a kashida position is valid or not.

I thought it might be a good idea, to mark kashida positions as valid or invalid
once, so that the correct information would available to the cursor context. But
for that I needed to know a lot more on the lifetime of the SwScriptInfo object.
Also important: when do I need to initialize and reset this information?

Any suggestions? 
Comment 22 hennerdrewes 2008-04-09 10:16:57 UTC
Created attachment 52673 [details]
patch draft for porlay.cxx
Comment 23 frank.meies 2008-04-09 12:35:07 UTC
fme->hennerdrewes: Thank you for your patch. I see. If called from the text
formatting, the code still returns the number of kashida positions inside the
given text range., because pKernArray = 0, nSpaceAdd = 0, nMinKashidaWidth = 0.
The second case - the function is called from fntcache.cxx for cursor
calculation of text painting - works as follows: First (like in the first case),
we calculate the number of kashidas in the text range. If we find, that the
space provided for the kashidas is smaller than a nMinKashidaWidth and there is
more than one kashida in the text range, we start discarding the kashida
positions at the beginning of the text range, as long as the nSpaceAdd value for
the remaining kashidas is < nMinKashidaWidth. Sounds reasonable.

I would suggest to proceed as follows:

In CalcNewBlock(), the nSpaceAdd value for each text line is stored at the
SwLineLayoutPortion. After this calculation has been performed, we know the
nSpaceAdd value for this line based on the assumption, that each possible
kashida position can be used. Here we need to insert some new code that iterates
over the text portions in the current line (SwTxtAttrIterator), checks which
font will be used for the portion, get the minimum kashida width for that font.
If the minimum kashida width is smaller than the calculated nSpaceAdd value
stored at the SwLineLayout, we should start to mark some of the kashida
positions in the SwScriptInfo structure as invalid, just as you proposed. Then
the nSpaceAdd value for storing at the SwLineLayout will be calculated again
until we can make sure that nSpaceAdd > the minimum kashida width for all the
text portions in the current line. If we proceed this way, I think we can keep
the original version of KashidaJustify(). One more note: Each time CalcNewBlock
is executed, we first have to reset the invalid flags for the kashida postions
located in the current line.

What do you think?

Comment 24 hennerdrewes 2008-04-09 14:23:06 UTC
@fme: sounds promising...

I'll give it a try.
Comment 25 hennerdrewes 2008-04-11 17:32:58 UTC
@fme: I had a look at the place at CalcNewBlock(), where the new code should go.
But I am afraid that I don't understand enough details on all the portion and
info objects involved. Would you find some time to write this iteration over the
text portions? 

I'll attach all (hopefully) useful changes I made so far, especially the changes
in vcl to enable querying the kashida width. 
I added to OutputDevice the following methods, the first one queries the current
selected font (analogous to GetFontMetric()).

long GetMinKashida() const;
long GetMinKashida( const Font& rFont ) const;

I added to SwScriptInfo the code necessary to mark kashida positions as invalid.
 KashidaJustify returned to the original format, but now skips the invalid
kashida positions.



Comment 26 hennerdrewes 2008-04-11 17:34:25 UTC
Created attachment 52754 [details]
patches for files in vcl/inc
Comment 27 hennerdrewes 2008-04-11 17:34:58 UTC
Created attachment 52755 [details]
patches for files in vcl/source
Comment 28 hennerdrewes 2008-04-11 17:36:27 UTC
Created attachment 52756 [details]
patches for files in vcl/win
Comment 29 hennerdrewes 2008-04-11 17:38:34 UTC
Created attachment 52757 [details]
patch for file vcl\win\source\gdi\winlayout.cxx (on version 1.108.206.2)
Comment 30 hennerdrewes 2008-04-11 17:46:23 UTC
Created attachment 52758 [details]
patch for sw/core/source/inc/scriptinfo.hxx
Comment 31 hennerdrewes 2008-04-11 17:47:35 UTC
Created attachment 52759 [details]
patch for sw/source/core/text/porlay.cxx
Comment 32 frank.meies 2008-04-12 09:53:48 UTC
fme->hennerdrewes: Thank you for you patch. I'll try to have a look at this and
the text iteration issue next week.
Comment 33 hennerdrewes 2008-04-26 14:20:59 UTC
Created attachment 53207 [details]
new patch proposal
Comment 34 hennerdrewes 2008-04-26 14:29:10 UTC
@fme: I think I finally figured it out. Will you have a look?

The justification works fine now. There is one new problem, though. If I write
an Arabic line, apply justification and LTR paragraph, the line is offset to the
right. The offset apparently correlates to the adjustment of the kashida
spacing. A text line without kashida positions marked as invalid is painted
correctly. 

In a RTL paragraph everything is fine though. I'll post a screenshot for
clarification.
Comment 35 hennerdrewes 2008-04-26 14:43:51 UTC
Created attachment 53208 [details]
Screenshot of LTR paragraph:
Comment 36 hennerdrewes 2008-04-26 14:44:46 UTC
Created attachment 53209 [details]
Screenshot of RTL paragraph: OK
Comment 37 frank.meies 2008-04-28 07:40:16 UTC
fme->hennerdrewes: Thank you for your patch. I'll have a look at it as soon as
possible, promised.
Comment 38 hennerdrewes 2008-04-30 09:58:23 UTC
@fme: sorry to bother you again...

I think I understand the remaining problem.

BidiPortion::CalcSpacing works on the stored number of blanks / kashidas, which
is set before we drop kashida positions. The nSpaceAdd parameter, on the other
hand, is the updated value after kashida positions are dropped.

TxtPortion::CalcSpacing recalculates its number of blanks on the fly, so the
dropped kashida positions are taken into account.

Any hints or suggestions?
Comment 39 frank.meies 2008-04-30 10:11:07 UTC
fme->hennerdrewes: No, not yet. I'll have to dig into this somewhat deeper, but
unfortunately I'm pretty busy right now. I'll have a look as soon as possible.
Comment 40 frank.meies 2008-05-28 09:29:34 UTC
fme->hennerdrewes: I had a look at the problem with the bidi portions. You are
right, for common text portions, CalcSpacing() as well as GetSpaceCnt() both
calculate the number of blanks on-the-fly whereas the SwBidiPortion stores the
number of blanks found during the formatting of the multi portion and makes use
of this value later on. Thinking this over, I cannot imagine a reason why this
has to be like this. I think the bidi portion can also calculate the number of
blanks on demand. So the code in pormulti.cxx that starts with the comment "//
Calculate number of blanks ..." can be moved to SwBidiPortion::GetSpaceCnt().
What do you think?
Comment 41 hennerdrewes 2008-05-28 11:20:22 UTC
@fme: In the mean time I did some experimenting with adjusting the values via
SwBidiPortion::SetSpaceCnt(). But if the Bidi Portion can calculate the number
of blanks on the fly, it would be certainly the more elegant solution. 

Anyhow, after solving this I expect the patches to work just fine.

Except for this case: If a font is applied, that substitutes certain character
combinations with glyph ligatures (like "Arabic Typesetting"), kashidas may be
inserted at illegal positions (and words will break again). To fix this, writer
would have to query for the actual behavior of the current font. I think this
could be done right before the MinKashidaWidth related dropping of kashida
positions. So there would be a second reason to drop kashida positions.

But I am not sure how to get to that font dependent information. It would
certainly require new interfaces in vcl to provide that information. (Maybe this
should move to a separate issue?)

I will post some updated version of the patches soon.


Comment 42 hennerdrewes 2008-05-31 13:54:34 UTC
Created attachment 54111 [details]
new patch for writer
Comment 43 hennerdrewes 2008-05-31 14:05:02 UTC
@fme: Added a new patch for writer.

Tried to implement everything as discussed. SwBidiPortion::GetSpaceCnt() takes
now a SwTextSizeInfo parameter, so I also had to make some changes, when this
function is called in other places. If you could check that again...

Anyhow, I haven't noticed any problems during my tests.

I also added some lines in fntcache.cxx to prevent displaced kashidas, when the
language attribute is not Arabic (issue 28203). But maybe the solution proposed
with issue 28203 (checking unicode character values and not the language
attribute) should be implemented now?
Comment 44 frank.meies 2008-06-02 06:08:23 UTC
@hennerdrewes: Sounds good. I'll have a look.
Comment 45 frank.meies 2008-06-12 08:45:53 UTC
fme->hennerdrewes: Thank you for your patch! Quite a lot of changed code ;-)
Since issue 87688 already has been integrated, I guess some of the code from the
patch is obsolete, is this correct?

[...] I also added some lines in fntcache.cxx to prevent displaced kashidas,
when the
language attribute is not Arabic (issue 28203). [...]

Please also remove these lines from the patch. This makes it easier to review
the patches.

fme->hdu: Would you like to set up a cws for this?
Comment 46 hennerdrewes 2008-06-12 13:00:57 UTC
@fme: I tested the changes on OOH680_m12 code, and the patch files also reflect
the changes to that version.

So the patch for porlay.cxx definitely contains code, which already has been
integrated. Just haven't had the time to transfer it to and test it with DEV300.
Can you manage with the patch as it is? 



Comment 47 hdu@apache.org 2008-06-12 13:16:08 UTC
> fme->hdu: Would you like to set up a cws for this?

I'd love to, but I'm completely under water...
Since the remaining patch is all in the SW module it would be a good idea to reassign it, ok?
Comment 48 frank.meies 2008-06-12 14:31:06 UTC
I'll create a cws ('kashidafix') based on DEV300m19 and commit the patches.
Comment 49 frank.meies 2008-06-13 08:10:12 UTC
fme->hennerdrewes/hdu: I found that some of the vcl_win.patch already made its
way into the current version (m19). Is this correct?
Comment 50 hennerdrewes 2008-06-13 08:27:00 UTC
@fme: Yes, that's correct.

On Apr 11 I also attached a separate patch for winlayout.cxx, which should
include only the recent changes related to this issue.
Comment 51 frank.meies 2008-06-13 08:31:46 UTC
fme->hennerdrewes: I tried to move the patch DEV300m19 but failed with the vcl
part. Please transfer the patch to DEV300m19, at least the vcl part the patch,
that would help a lot.
Comment 52 hennerdrewes 2008-06-13 09:05:08 UTC
@fme: OK
Comment 53 hennerdrewes 2008-06-13 10:46:06 UTC
Created attachment 54451 [details]
patch for vcl reworked
Comment 54 hennerdrewes 2008-06-13 10:46:56 UTC
@fme: hope this will be OK
Comment 55 hdu@apache.org 2008-06-13 13:21:44 UTC
@hennerdrewes: I just had a very quick look at the vcl patch
- please use the USP library loaded in winlayout.cxx, loading it in salgdi3 too is IMHO not needed
- eleven levels of indentation... maybe they are needed but they make the already complex LayoutText 
even more difficult
- only the Windows part of VCL is changed, I'm worried about the other ports: Do they need work too
Comment 56 hennerdrewes 2008-06-13 15:29:14 UTC
@hdu: I am aware that this version of the patch can be seen as a draft only.
There ary many things that should be cleaned up. 

For example: if ImplWinFontEntry was accessible in salgdi3.cxx, the ScriptCache
of ImplWinFontEntry could be used in the call to ScriptGetFontProperties.

Until now, I just tried to get things working, changing header files only when
absolutely necessary.

- For the other ports: 
They need work.
Basically they should provide a value for the minimal kashida width as done in
WinSalGraphics::GetFontMetric. If nothing is implemented though, calls to
OutputDevice::GetMinKashida will return 0. The new code in writer is executed
only if a positive GetMinKashida value is provided, so the patches should be
safe even if not fully implemented on all platforms. 

I have no idea to what extent changes are still needed in the layout process on
the other platforms. In UniscribeLayout::ApplyDXArray I had to add some new
stuff to correct results of ScriptApplyLogicalWidth (of course, this also needs
cleaning, and I might have some more suggestions for that..).








Comment 57 frank.meies 2008-06-14 07:06:12 UTC
I committed the patches to cws kashidafix, based on DEV300m19. Any further
patches should be relative to this cws. I did not modify the vcl patch but made
some minor changes to the sw patch.

fme->hennerdrewes/hdu: To judge from your last comments I guess that work on the
vcl part has not yet been finished, is this correct?

Comment 58 hennerdrewes 2008-06-19 10:02:23 UTC
Created attachment 54588 [details]
patch correction
Comment 59 hennerdrewes 2008-06-19 10:51:06 UTC
@fme: I made some corrections to your patch modifications.

@fme/hdu: I think it will be necessary to clearly define the scope of this
issue. Just after doing this, it will be possible to answer your questions, how
much further work is needed. 

The patches in their current state considerably improve the kashida
justifications on Windows. But here is a list of further things we should think
about:

1. General cleanup of vcl patches (as pointed out by hdu)
2. Support for other platforms.
3. Support for fonts with glyph ligatures (as mentioned in my comment from May 28)
4. Fix PDF export (Windows)

Additional comments
for 1: I will do that...

for 2: Basically, the other platforms need to provide a value for the minimal
kashida width. Most fonts I checked on windows use a value of 3/5th of the width
of the tatweel glyph. So maybe this could be used also on the other platforms.

But as hdu pointed out on April 8:
> I'm sure that GenericSalLayout::KashidaJustify() could benefit from more
> debugging. I'd delay this task though until we are happy with the 
> justification result on one platform (USP?) and then port it to the 
> other platforms. 

So there actually might be a lot more work left for the other platforms. 

But please take into account that I probably won't be able to help much for this
one.

for 3: This is my most urgent question: Extend the current patches to support
fonts, where certain character combinations are replaced by a single glyph?
Actually the new kashida dropping mechanism could be used for that, too. In
CalcNewBlock(), just before the checking of the minimal kashida width, we would
have to ask vcl, if each of our kashida positions fall on two separate glyphs or
if they are replaced by one glyph. In the latter case, we need to drop the
kashida position. 

for 4: PDF export on Windows needs manual insertion of tatweel glyphs.



Comment 60 hdu@apache.org 2008-06-19 12:29:30 UTC
I can do 1. 2. 3. and 4. but certainly not in the timeframe for OOo3.0 and probably not for 3.1. On the 
other hand it looks like the current status is quite an improvement and we could try to get this into 3.0 as 
it is and handle the other tasks as followup tasks with a different target milestone.
Comment 61 frank.meies 2008-06-25 11:42:15 UTC
fme->hdu/hennerdrewes: So let me sum this up. The current state of the patch
improves the kashida handling on Windows significantly, other platforms are not
affected at all. So besides from the fact that there's still some work left, is
there any good reason *not* to accept the patch? So I suggest that Henner
continues to work on task 1 (General cleanup of vcl patches) and after that
we'll try to integrate the patch for 3.1 (3.0 target is too close I think).

Comment 62 hdu@apache.org 2008-06-25 14:24:18 UTC
@fme: yes, that's exactly the way I see it (OOo 3.1)
Comment 63 hennerdrewes 2008-06-27 08:51:57 UTC
@hdu/fme: OK, understood. Hopefully will be posting something next week.
Comment 64 hennerdrewes 2008-07-04 10:44:18 UTC
Created attachment 54935 [details]
cleaned up patch for vcl
Comment 65 hennerdrewes 2008-07-04 10:50:34 UTC
@hdu: posted the reworked vcl patch. 

I moved the declaration of ImplWinFontEntry from winlayout.cxx to salgdi.h,
which helped to eliminate duplicate code. Hope this is OK.
Comment 66 Joost Andrae 2008-07-09 11:21:35 UTC
retarget issue to 3.1
Comment 67 hdu@apache.org 2008-07-15 09:41:37 UTC
@hennerdrewes: the new patch is much better, thank you! With the new patch the USP integration does no 
longer sprawl into other *.cxx files, just into the central salgdi.h. I don't remember why maScriptCache 
became a member of ImplWinLayout instead of UniscribeLayout, so I have no right to complain... but I'd 
rather have USP not go into salgdi.h. If there is a little "breathing" time between 3.0 and 3.1 I'll change it 
accordingly, else the patch can go in as it is.
Comment 68 frank.meies 2008-09-17 08:37:49 UTC
fme->hennerdrewes: Thanks again for your patch. I had a closer look at the
Writer stuff. Here are some comments:

1. The invalidation of kashidas has to be done per portion. Image this case: The
line starts with one huge Arabic character, the font of which reports a minimum
kashida width of 4000 twips. Then the kashida positions in the line are all
marked as invalid, although the portion which requires the 4000 twips does not
even have a kashida position.

2. I think using a sorted array (or a more 'modern' data structure like
std::set) is more suitable for the kashida array, since right now
IsKashidaInvalid/ClearKashidaInvalid always have to iterate the whole array.

3. if for some mysterious reason if(rSI.MarkKashidaInvalid(aItr.GetStart(),
aItr.GetLength()) fails, we have an infinite loop in itradj.cxx. 
Comment 69 hennerdrewes 2008-09-17 09:06:59 UTC
@fme: Thanks for your comments.

In the mean time, I have prepared some additional changes to support fonts with
glyph ligatures. I hope that I can post a patch later this week, and hopefully I
can include some of your hints. 

Other questions / comments: I have the feeling that since I made the changes to
SwBidiPortion::GetSpaceCnt(), there is a inaccuracy in the cursor placement,
when the cursor is placed on the boundary between LTR and RTL runs. Are you
really sure, there was no deeper intention behind the way GetSpaceCnt() was
implemented? 

I haven't found the time yet to revert the changes and see, if this is the
actual cause. But as Hebrew text is also affected, and all other changes
shouldn't affect Hebrew, I'd be surprised if the cause is elsewhere.


Comment 70 frank.meies 2008-09-17 16:25:21 UTC
@hennerdrewes:

[...] Other questions / comments: I have the feeling that since I made the
changes to
SwBidiPortion::GetSpaceCnt(), there is a inaccuracy in the cursor placement,
when the cursor is placed on the boundary between LTR and RTL runs. Are you
really sure, there was no deeper intention behind the way GetSpaceCnt() was
implemented? [...]

No, of course I'm not ;-) Have a look at the code following this comment in
itrcrsr.cxx:

// we came from inside the bidi portion, we want to blink...

There, we call SwBidiPortion::CalcSpacing() with rInf.nIdx already set to the
start of the next portion. Now that the number of blanks in the bidi portion is
calculated on-the-fly we have to make sure that rInf.nIdx is set to the start of
the bidi portion, just like with a regular SwTxtPortion::GetSpaceCntnt call. I
suggest to store the start of the bidi portion if we set pLastBidiPortion and
set this value temporarily at rInf before calling CalcSpacing().
Comment 71 frank.meies 2008-09-17 16:28:54 UTC
@hennerdrewes: Or better: Store with width of the bidi portion.
Comment 72 hennerdrewes 2008-09-20 17:57:53 UTC
@hdu: 

I just reviewed how my last patch was integrated. And I have one question on
that: In my version I managed to call ScriptGetFontProperties() with the same
SCRIPT_CACHE from ImplWinFontEntry. This was the reason for moving things to
salgdi.h. Are you still planning to clean this up, or should it be left as it is?

As I prepared some further changes, it would help to know from where to start
additional patches.

 
Comment 73 hennerdrewes 2008-09-21 20:53:52 UTC
It took some more glances until I understood that the last vcl.patch hasn't been
integrated. 

Anyhow, my last question is still relevant: how to prepare additional patches?
Comment 74 hdu@apache.org 2008-09-22 09:08:45 UTC
@hennerdrewes: the vcl.patch not being commited into CVS was an oversight; it is in now. I still intend to 
clean this up but cannot start it for at least the next two weeks. As for how to prepare the patch I'd 
suggest to do the diff against the tag CWS_DEV300_KASHIDAFIX.

What would be even better is to get you commiter rights. You have proven that you know what you are 
doing. If you are interested in that I'd support you getting commit rights. I'd wait for the SVN migration to 
settle before http://wiki.services.openoffice.org/wiki/OOo_and_Subversion#SSH_Setup is done.
Comment 75 hdu@apache.org 2008-09-22 09:39:22 UTC
correction to above: the tag is CWS_DEV300_KASHIDAFIX_ANCHOR
Comment 76 hennerdrewes 2008-09-23 15:18:56 UTC
Created attachment 56772 [details]
vcl.patch (task #3)
Comment 77 hennerdrewes 2008-09-23 15:19:46 UTC
Created attachment 56773 [details]
sw.patch (task #3)
Comment 78 hennerdrewes 2008-09-23 15:34:56 UTC
@fme, @hdu: just added new patches implementing the support for fonts with glyph
ligatures.

patches are relative to cws_dev300_kashidafix.

The sw.patch also includes the changes of issue 91465. The new code of this
issue relies on issue 91465.

@fme: I considered your comments #1 and #3. I haven't had the time to work on
your comment #2, although the new implementation will already save some
iterations, as kashidas are not marked necessarily one by one anymore.

Also fixed the cursor traveling.

I'll post some more comments and questions, but I need to go over the changes
once more. That might take a couple of days.


Comment 79 hennerdrewes 2008-09-25 13:52:48 UTC
@hdu: I added the methods KashidItemFix() and KashidaWordFix() to
UniscribeLayout. ApplyDxArray() looks a lot nicer, now that all the kashida
related post-processing has moved. But when I look at what I did in
KashidaWordFix(), I cannot believe that I had to write so many lines to fix the
gaps between the letters. Although the justification values requested by writer
are now much more appropriate, the values returned by ScriptApplyLogicalWidth
are very often just one or two pixels below the minimal kashida width, causing
the nasty gaps.

The new code seems to fix that quite well, but honestly, I don't like it.

But I had another idea: It should be possible to insert kashida glyphs manually.
If we were a little bit more tolerant regarding the minimal kashida width than
ScriptTextOut, we should be able to produce proper text output without this
width fixing. 

Anyhow, as far as I understood this is what is happening on linux and what we
have to add for PDF output on windows. So the manually inserted kashida glyphs
could fix the gap problem and the PDF output. But I guess we'll have to try and
check the visual results.
 
Comment 80 hennerdrewes 2008-09-25 14:39:44 UTC
@fme: In itradj.cxx the new code validates the kashida positions with the
current font. Only after that we'll check the resulting kashida width against
the minimal kashida width of the fonts. 

The new code also considers the possibility that all kashidas will be canceled.
This situation is very likely with some fonts, both because of glyph ligatures
and huge kashida glyphs. Regular blank justification should be done in that case. 

As the application of kashida justification still depends on the language
attribute, I had to check this attribute in additional places. But this could be
removed once issue 28203 is solved.

As we also have to prepare for blank justification for Arabic, I once more
suggest some changes to fntcache.cxx. The half spacing applied to Arabic
inevitably causes kashidas to appear between the words. 
Comment 81 frank.meies 2008-10-01 12:39:21 UTC
@hennerdrewes: Thank you for your patch. I noticed that there are still these
annoying _ artifacts in case a text line falls back to a common "blank
justification" even with the bNoHalfSpace change in fntcache.cxx. I'll attach a
document that shows the problem. Please have a look.
Comment 82 frank.meies 2008-10-01 12:41:10 UTC
Created attachment 56947 [details]
new bugdoc
Comment 83 hennerdrewes 2008-10-01 12:55:14 UTC
@fme: Can you upload the bugdoc once more? The file seems to be corrupted.

And can you also make a screenshot? I will check the file with my last build,
before the changes from cws swarab02 were introduced. I haven't built the last
version with the cws swarab02 changes yet. So until then, I can at least compare
my version to the screenshot.
Comment 84 frank.meies 2008-10-01 13:23:22 UTC
Created attachment 56950 [details]
2nd try
Comment 85 frank.meies 2008-10-01 13:24:18 UTC
Created attachment 56951 [details]
screenshot
Comment 86 frank.meies 2008-10-01 13:24:58 UTC
@hennerdrewes: Sorry, new bugdoc and screenshot attached.
Comment 87 hennerdrewes 2008-10-01 13:43:23 UTC
@fme: I also see the problem. So it is not connected to the swarab02 changes.
I'll dig into that a little later.
Comment 88 hennerdrewes 2008-10-09 15:17:11 UTC
@fme: 

Something seems to be wrong with the expanded blanks, if the blank is the last
character of the portion. The bNoHalfSpace actually doesn't matter in this case, as 

> fntcache.cxx line 1871
> if ( i + 1 == nCnt )
>     nSpaceSum += nSpaceAdd;
      
Strangely enough, in UniscribeLayout::ApplyDxArray, the mpCharWidths array shows
the character before the blank with the expanded width, and the blank with its
regular width. My crazy UniscribeLayout::KashidaItemFix() ensures that this
problem hasn't been left unnoticed, (: as it attempts to expand stretched glyphs
to the minimal kashida width. Without this the visual result of the problem
would occur less frequently, only when the added width exceeds the minimal
kashida width.
Comment 89 hennerdrewes 2008-10-09 15:21:34 UTC
Created attachment 57087 [details]
Screenshot
Comment 90 hennerdrewes 2008-10-09 15:28:36 UTC
Actually, if I look at this in Western text (see screenshot I've just attached),
it seems to be regular behaviour. When the blank is the last character of the
portion, the character before the blank seems to get expanded. The "blank-dot"
moves to the right.

While this is not problematic with Western text, it disrupts the blank
justification for Arabic.
Comment 91 hennerdrewes 2008-10-13 10:27:20 UTC
Created attachment 57136 [details]
patch fntcache.cxx
Comment 92 hennerdrewes 2008-10-13 10:29:08 UTC
@fme: uploaded a suggested patch for fntcache.cxx

Actually the problem equally affects word underline. See following screen shot.
Comment 93 hennerdrewes 2008-10-13 10:31:26 UTC
Created attachment 57137 [details]
screenshot showing the problem for Western text with word underline
Comment 94 hennerdrewes 2008-10-13 10:33:44 UTC
Comments to screen shot BlankAtPortionEndWordUnderline.png: The "t" of "Welt" is
stretched instead of the following blank.
Comment 95 eric.savary 2008-10-13 11:17:56 UTC
*** Issue 94896 has been marked as a duplicate of this issue. ***
Comment 96 frank.meies 2008-10-16 10:35:10 UTC
@hennerdrewes: Patch works fine, thank you.
@hdu/hennerdrewes: Anything left to do in vcl? Or are we done with this issue?
Comment 97 hennerdrewes 2008-10-17 09:21:11 UTC
@fme: I would like to go over the vcl stuff once more.

Other things that come to my mind: 

1. When I validate the kashida positions in itradj.cxx, I set the layout mode as
following:
> ULONG nOldLayout = rInf.GetOut()->GetLayoutMode();
> rInf.GetOut()->SetLayoutMode ( nOldLayout | TEXT_LAYOUT_BIDI_RTL );
> nKashidasDropped = rInf.GetOut()->ValidateKashidas ( ...);
> rInf.GetOut()->SetLayoutMode ( nOldLayout );

When wrote this, I wasn't really sure what is required. I found out that I need
to set the TEXT_LAYOUT_BIDI_RTL flag to get correct results for LTR paragraphs.
But since this has been done more by "trial and error" and not by knowing, I'd
be glad, if you give it a thought.

2. We haven't dealt with your comments on the data structures:
> I think using a sorted array (or a more 'modern' data structure 
> like std::set) is more suitable for the kashida array, since right 
> now IsKashidaInvalid/ClearKashidaInvalid always have to iterate the 
> whole array.

@hdu: Can you comment on my remarks from 25/9? 
Comment 98 hdu@apache.org 2008-10-17 15:47:23 UTC
> @hdu: Can you comment on my remarks from 25/9?

Inserting the kashida-glyphs into the PDF-export was already on my TODO list. If this also helps for 
simplifing code in Kashida*Fix it is all the better. I'm looking into it and hope to get it done in the next few 
days.
Comment 99 hennerdrewes 2008-10-17 20:42:14 UTC
@hdu: great.
Comment 100 frank.meies 2008-10-20 09:45:36 UTC
@hennerdrewes: I guess we can replace this code by setting

SwLayoutModeModifier aLayoutModeModifier( rInf.GetOut() );
aLayoutModeModifier.Modify( sal_True );

Since at this point in the code we are sure that the current text is Arabic
(which will be written from right to left), the SwLayoutModeModifer can be used
to pass this to the output device. The TEXT_LAYOUT_BIDI_STRONG flag will be set
(i.e., "don't check the direction, I know what I do) and the
TEXT_LAYOUT_BIDI_RTL flag will be set (i.e., "the next text is rtl text").

Please check, it the code works with the change.
Comment 101 ace_dent 2008-10-27 08:16:47 UTC
*** Issue 87476 has been marked as a duplicate of this issue. ***
Comment 102 hdu@apache.org 2008-10-29 11:51:16 UTC
> Inserting the kashida-glyphs into the PDF-export was already on my TODO list.

Done (also see issue 71804)

> If this also helps for simplifing code in Kashida*Fix it is all the better.

The above was solved differently, because that could possibly break other uses of the resulting SalLayout, e.g. with ScriptTextOut() etc.

@fme: from my POV the CWS code is ready...
Comment 103 hdu@apache.org 2008-11-26 11:42:23 UTC
Fixed for Windows platforms.
Comment 104 hdu@apache.org 2008-11-28 09:04:57 UTC
@sba: please verify on XP and Vista, please test for regressions on other platforms
Comment 105 stefan.baltzer 2008-12-10 09:05:13 UTC
Verified in CWS kashidafix.
Comment 106 eric.savary 2009-01-19 16:20:25 UTC
*** Issue 98194 has been marked as a duplicate of this issue. ***
Comment 107 stefan.baltzer 2009-04-27 12:47:12 UTC
OK in OOO310_m11. Closed.