Issue 4587 - The Chinese typesetting just doesn't work out
Summary: The Chinese typesetting just doesn't work out
Status: CLOSED FIXED
Alias: None
Product: Writer
Classification: Application
Component: code
Version: OOo 1.0.0
Hardware: PC Windows 2000
Importance: P3 Trivial
Target Milestone: ---
Assignee: stefan.baltzer
QA Contact: issues@sw
URL:
Keywords:
Duplicates: 4785 5793 7600
Depends on:
Blocks:
 
Reported: 2002-05-07 11:02 UTC by Unknown
Modified: 2003-09-08 16:56 UTC

See Also:
Issue Type: DEFECT
Latest Confirmation in: ---
Developer Difficulty: ---


Description Unknown 2002-05-07 11:02:07 UTC
In all Chinese (Traditional or Simplified) versions of Open Office, the 
typesetting simply doesn't work at all.

In the paragraph style, apply justified alignment to a paragraph of Chinese 
mixed with English sentences and you will see right away that the typesetting 
doesn't work. Unless the program has some further tuning secrets I don't know 
of, it is too defective to be used for Chinese at all.

A few points I want to make:
1. When one clicks into a sentence containing some Chinese, the language 
selector on the toolbar doesn't show the corresponding typeface name as it 
does for English.
2. The typesetting is wrong mainly because each Chinese character is itself 
a block unit that doesn't require surrounding spaces to mark a word, unlike 
English. So the basic algorithm treats a string of Chinese characters as one 
very long unbreakable word and breaks the lines incorrectly.
3. Each Chinese character has its own spacing built in along with the 
character itself inside the font set. One needs to use this information to 
display Chinese properly. Otherwise you will see a clustered mass of symbols 
meshed together.
4. The Asian typography option for adding a space between Chinese and 
non-Chinese text, such as numbers or English, doesn't work either: no space 
is added to the sentence.
Comment 1 frank.meies 2002-05-07 11:39:09 UTC
FME: Need more information:

1. What's a language selector? Do you mean the font selector?
2. Do you refer to the justified alignment? What's the intended
behavior?  In Western text we distribute the remaining space in a line
equally to the blanks. So for Chinese, do we have to distribute the
remaining space equally to all Chinese characters?
3. Are the Chinese characters displayed wrong? I need more information.
4. Some extra space is added between Asian and Western text portions.
We do not add a blank to the string.
Comment 2 frank.meies 2002-05-07 12:57:06 UTC
FME: I take this one.
Comment 3 Unknown 2002-05-08 04:35:05 UTC
> 1. What's a language selector? Do you mean the font selector?
Yes, that is what I meant: the font selector.

> 2. Do you refer to the justified alignment? What's the intended
> behavior?
In fact, it is about all alignments, including left, center and 
justify. The problem is with the word breaking. There is no word 
breaking in Chinese. For example, in English, to recognize a word 
such as "WORD" in a sentence, the spaces before and after "WORD" 
signal that it is one unit, right? In Chinese, on the other hand, a 
character is a unit by itself and there are no intended spaces 
before or after it. Therefore, a whole sentence of Chinese is 
treated as one very long word. Hence, the word breaking and line 
breaking will be wrong.
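
To make the point concrete, here is a minimal sketch (not OpenOffice.org code) of a line wrapper that allows a break between any two Chinese characters while keeping Western words whole. The names `is_cjk`, `break_units` and `wrap` are hypothetical, only the basic CJK Unified Ideographs block is recognized, each ideograph is assumed to be two columns wide, and punctuation rules are ignored.

```python
def is_cjk(ch: str) -> bool:
    """True for the basic CJK Unified Ideographs block (a simplification)."""
    return "\u4e00" <= ch <= "\u9fff"


def break_units(text: str) -> list[str]:
    """Split text into units; a line may break between any two units.

    Each ideograph is its own unit, a run of other non-space characters
    is one word, and a run of whitespace collapses to a single " ".
    """
    units, word = [], ""
    for ch in text:
        if ch.isspace() or is_cjk(ch):
            if word:
                units.append(word)
                word = ""
            if is_cjk(ch):
                units.append(ch)
            elif not units or units[-1] != " ":
                units.append(" ")
        else:
            word += ch
    if word:
        units.append(word)
    return units


def display_width(s: str) -> int:
    """Assumed column width: 2 per ideograph, 1 per other character."""
    return sum(2 if is_cjk(c) else 1 for c in s)


def wrap(text: str, width: int) -> list[str]:
    """Greedy wrapping over break units."""
    lines, line, pending_space = [], "", False
    for unit in break_units(text):
        if unit == " ":
            pending_space = True
            continue
        candidate = line + (" " if pending_space and line else "") + unit
        if line and display_width(candidate) > width:
            lines.append(line)   # the unit does not fit: start a new line
            line = unit
        else:
            line = candidate
        pending_space = False
    if line:
        lines.append(line)
    return lines


print(wrap("中文排版e-mail is 重要", 8))
```

A wrapper that only breaks at blanks would treat "中文排版e-mail" as a single unbreakable word, which is the behavior described above.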

>In Western text we distribute the remaining space in a line
> equally to the blanks. So for Chinese, do we have to distribute the
> remaining space equally to all Chinese characters?
Yes, it should be so if there is any remaining space left in a line. 
That issue aside, the normal display of a Chinese character involves 
more than just drawing the glyph as in English. For the sake of 
beautiful typesetting, each character has its own built-in space 
along with the character. This information also varies between 
font faces; it depends on the font maker's design. Thus, 
English text like this gets the same white space each time I press 
the space bar. On the other hand, during normal typing of Chinese one 
doesn't need to press the space bar for spacing, but each character 
has its own little space built in to make the sentence readable to 
the human eye. Typing Chinese continuously without ever pressing the 
space bar is the normal behavior.

Simply put, each Chinese character is a rectangular block or box. The 
space that English puts between words is contained inside the box as 
part of the whole unit.
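
As a rough illustration of justified alignment for such boxes, the sketch below spreads the leftover space of a line equally over the gaps between characters. It is only an assumption about the intended behavior, not OpenOffice.org code: `justify_positions` is a hypothetical name, and the advance widths (each box including its built-in space) are taken to come from the font.

```python
def justify_positions(advances: list[float], line_width: float) -> list[float]:
    """Return the x offset of each character box on a justified line.

    advances   -- width of each character box, built-in spacing included
    line_width -- the full width the line should occupy
    """
    gaps = len(advances) - 1
    leftover = line_width - sum(advances)
    # Share the leftover space equally among the gaps between boxes.
    extra = leftover / gaps if gaps > 0 and leftover > 0 else 0.0
    positions, x = [], 0.0
    for advance in advances:
        positions.append(x)
        x += advance + extra
    return positions


# Four boxes of width 10 on a line of width 46: 6 units of leftover
# space are split over 3 gaps, so each box starts 12 units after the last.
print(justify_positions([10, 10, 10, 10], 46))
```

With these numbers the last box starts at 36 and ends exactly at the line width of 46.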

> 3. Are the Chinese characters displayed wrong? I need more 
> information.
The Chinese characters are displayed correctly, but they are not 
typeset correctly in the sense described above.

> 4. Some extra space is added between Asian and Western text 
> portions.
I have seen no extra space added.
Comment 4 Unknown 2002-05-08 04:37:45 UTC
FYI, StarOffice or StarSuite from Sun doesn't work properly on this 
issue either. RedOffice, on the other hand, works fine, just like the 
MS Office Chinese edition (a trial version can be downloaded from 
http://www.ch2000.com.cn/download/cpdownload/RedOffice-0.9.7-5.limittime.i386.rpm).
Comment 5 Unknown 2002-05-08 04:42:29 UTC
Here is another conventional way of dealing with Chinese fonts, FYI.

Text assumption: a mix of Chinese and English text
Issue: How to change the Chinese and English fonts to the fonts the 
user chooses from the font selector?
Actions:1. Highlight the paragraph of text whose fonts the user would 
like to change (multi-fonts).
        2. First select the Chinese font to set the desired Chinese
font, for example mingLiu.
        3. Next select the English font to set the desired English
font, for example Georgia.

Result: The user successfully changes multiple fonts within a 
paragraph of mixed text. For example, a line of text with Chinese 
displayed in mingLiu and English text in Georgia.
Remark: Do you see that steps 2 and 3 cannot be swapped and still give 
the same result? The reason is that any Chinese font set also contains 
glyphs for the ASCII characters. Thus, a Chinese font set alone is 
enough to typeset both the English text and the Chinese text.

For Open Office: The above doesn't work at all.
Comment 6 Unknown 2002-05-08 04:54:52 UTC
I have come up with a good illustration of the issue. In the 
following, each x stands for a Chinese character. Try formatting the 
following example in your Open Office with each kind of alignment.

Example:

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxx, e-mail is the essential tool 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
With nearly 70% 
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxx importance of e-mail communication servers 
for xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx. 
Comment 7 frank.meies 2002-05-08 08:26:05 UTC
Ok, now I understand the problem. I'll try to sum up and give some
answers:

1. Word breaking does not work because only blanks are recognized as
possible break positions:

The word break algorithm used in StarOffice is different from the one
used in OpenOffice.org. The current algorithm in OpenOffice.org simply
searches for blanks in the string, whereas the algorithm in StarOffice
is much more sophisticated (locale dependent). This algorithm could
not be open-sourced for legal reasons, but this problem is going to be
solved in one of the next versions.

2. The font selector does not show the Asian font when the cursor is
in an Asian part of the text.

We have different fonts for different script types in the Writer. If
you have a look at the Format - Character dialog you can set two
fonts, one for Western and one for Asian text (this requires Asian
language support to be enabled in Options - LanguageSettings).
Unfortunately this does not work for OpenOffice.org, only for
StarOffice, because in OpenOffice.org there is currently no
functionality to distinguish different script types. The reason for
this is the same as the one for 1. This is going to be solved as well.

3. The extra space between different script types does not work in
OpenOffice.org because (as already mentioned in 2.) there is no
functionality to distinguish different script types. Will be solved.
Comment 8 frank.meies 2002-05-13 09:13:19 UTC
*** Issue 4785 has been marked as a duplicate of this issue. ***
Comment 9 frank.meies 2002-05-27 08:04:58 UTC
-
Comment 10 frank.meies 2002-06-13 11:48:56 UTC
*** Issue 5793 has been marked as a duplicate of this issue. ***
Comment 11 prgmgr 2002-09-14 23:50:22 UTC
*** Issue 7600 has been marked as a duplicate of this issue. ***
Comment 12 Unknown 2002-10-07 10:30:37 UTC
It is quite a different picture on my machine. I ran OOo on Windows Me
(Taiwan version), and got the following results.

(1) I loaded a Word file (.doc) with Chinese characters and all Chinese
characters were displayed as squares. As far as I know, on MS Windows
these squares denote characters that have no corresponding entry in
the font file.

(2) The fonts for English and for Chinese can be set separately, so I
picked a Chinese font as the font for Chinese characters and nothing
happened.

(3) I picked another Chinese font as the font for English characters
and now all Chinese and English characters are displayed normally,
using the font designated for English characters. (If I pick an English
font for English characters, of course, Chinese characters still
cannot be displayed.)

I think OOo is able to handle Chinese characters in the kernel but
fails to distinguish between Chinese character codes and English ones
when applying fonts, so the English font is applied to all characters,
regardless of whether they are Chinese or English.

CW Cheng is using Windows 2000 and that system is based on Unicode,
while I am using Windows Me which is based on ASCII (for English) and
Big5 (for Chinese). I don't know whether that makes the difference.

BTW, from this point of view, the word-wrap problem should be a
different issue from this one. But if OOo fails to recognize Chinese
character codes, treating each Chinese character as a single word
would be impossible. Recognizing them is a prerequisite.

Comment 13 frank.meies 2002-10-21 16:29:04 UTC
FME: Works in 643.
Comment 14 frank.meies 2002-12-06 08:24:39 UTC
.
Comment 15 frank.meies 2002-12-06 08:56:43 UTC
FME->SBA: Fixed.
Comment 16 stefan.baltzer 2002-12-10 16:05:50 UTC
Closed.
Comment 17 thorsten.ziehm 2003-05-20 16:15:32 UTC
This task is fixed or works in OOo 1.1 beta2.
Comment 18 thorsten.ziehm 2003-05-20 16:32:11 UTC
closed ...