Apache OpenOffice (AOO) Bugzilla – Issue 4587
The Chinese typesetting just doesn't work out
Last modified: 2003-09-08 16:56:16 UTC
With all Chinese (Traditional or Simplified) versions of Open Office, the typesetting simply doesn't work. In the paragraph style, choose justified alignment for a paragraph of Chinese mixed with English sentences and you will see right away that the typesetting doesn't work at all. Unless there are some further tuning secrets in the program that I don't know of, it is too defective to be used for Chinese at all. A few points I want to make:

1. When one clicks into a sentence containing some Chinese, the language selector on the toolbar won't show the corresponding typeface name as it does for English.
2. The typesetting is wrong mainly because each Chinese character is itself a unit block that doesn't require surrounding spaces to signal a word, unlike in English. So the basic algorithm treats a string of Chinese characters as one very long unbreakable word and breaks the lines incorrectly.
3. Each Chinese character has its own spacing built in along with the character itself inside the font set. One needs to use this information to display Chinese properly. Otherwise you will see a clustered mass of symbols meshed together.
4. The Asian typography option for adding a space between Chinese and non-Chinese text, such as numbers or English, doesn't work either. No space is added to the sentence.
FME: Need more information:

1. What's a language selector? Do you mean the font selector?
2. Do you refer to the justified alignment? What's the intended behavior? In Western text we distribute the remaining space in a line equally to the blanks. So for Chinese, do we have to distribute the remaining space equally to all Chinese characters?
3. Are the Chinese characters displayed wrong? I need more information.
4. Some extra space is added between Asian and Western text portions. We do not add a blank to the string.
FME: I take this one.
> 1. What's a language selector? Do you mean the font selector?

Yes, I meant the font selector.

> 2. Do you refer to the justified alignment? What's the intended
> behavior?

In fact, it is about all alignments, including left, center and justified. The problem is with the word breaking. There is no word breaking in Chinese. For example, in English, to recognize a word such as "WORD" in a sentence, the spaces before and after "WORD" signal that it is a unit block, right? In Chinese, on the other hand, a character is a unit block in itself, and there are no intended spaces before or after it. So a sentence of Chinese ends up being treated as one very long word, and the word breaking and line breaking come out wrong.

> In Western text we distribute the remaining space in a line
> equally to the blanks. So for Chinese, do we have to distribute the
> remaining space equally to all Chinese characters?

Yes, it should be so if there is any remaining space left in a line. That issue aside, the normal display of a Chinese character involves more than just drawing the glyph as in English. For the sake of typographic beauty, each character has its own built-in space. This information also varies between font faces; it depends on the font maker's design. English text like this has the same white space each time I press the space bar. During normal typing of Chinese, on the other hand, one doesn't press the space bar for spacing; each character has its own little built-in space that makes the sentence readable to the human eye. Typing Chinese continuously without ever pressing the space bar is the normal behavior. Simply put, each Chinese character is a rectangular block or box, and the space that English puts between words is contained inside the box as part of the whole unit.

> 3. Are the Chinese characters displayed wrong? I need more
> information.

The Chinese characters are displayed correctly, but they are not typeset correctly in the sense described above.

> 4. Some extra space is added between Asian and Western text
> portions.

I have seen no extra space added.
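To make the justification question concrete, here is a minimal sketch (not the Writer code) of the behavior discussed above: in Western text the remaining space goes only to blanks, while in Chinese text it is spread over every gap that touches a Chinese character. The function names are hypothetical, and the CJK check is an assumption that only covers the basic CJK Unified Ideographs block.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    # Assumption: only the basic CJK Unified Ideographs block is checked.
    return "\u4e00" <= ch <= "\u9fff"


def expandable_gaps(line: str) -> List[int]:
    """Return indices i where extra space may go between line[i] and line[i+1]."""
    gaps = []
    for i in range(len(line) - 1):
        left, right = line[i], line[i + 1]
        # Western rule: only a blank is stretched.
        # Chinese rule: any gap next to a Chinese character may be stretched.
        if left == " " or is_cjk(left) or is_cjk(right):
            gaps.append(i)
    return gaps


def extra_space_per_gap(line: str, natural_width: float, target_width: float) -> float:
    """Split the remaining line width equally over all expandable gaps."""
    gaps = expandable_gaps(line)
    remaining = target_width - natural_width
    if not gaps or remaining <= 0:
        return 0.0
    return remaining / len(gaps)
```

With this rule a line of pure Chinese has an expandable gap after every character except the last, so justified text stays evenly spaced instead of being left ragged.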
FYI, StarOffice and StarSuite from Sun don't work properly on this issue either. On the other hand, RedOffice (a trial version can be downloaded from http://www.ch2000.com.cn/download/cpdownload/RedOffice-0.9.7-5.limittime.i386.rpm) works fine, just like MS Office Chinese edition.
Here is another conventional way of dealing with Chinese fonts, FYI.

Text assumption: a mix of Chinese and English text.

Issue: How to change the Chinese and English fonts to the fonts a user chooses from the font selector?

Actions:
1. Highlight the paragraph of text whose fonts the user would like to change (multi-fonts).
2. First select the desired Chinese font, for example mingLiu.
3. Next select the desired English font, for example Georgia.

Result: The user succeeds in changing multiple fonts within a paragraph of mixed text. For example, a line of text displays its Chinese in mingLiu and its English in Georgia.

Remark: Note that steps 2 and 3 cannot be swapped and still give the same result. The reason is that any Chinese font set also contains glyphs for the ASCII range, so a Chinese font set alone is enough to typeset both the English text and the Chinese text.

For Open Office: The above doesn't work at all.
I have come up with a good illustration of the issue. In the following, each x stands for a Chinese character. Try to format the following example in your Open Office with each kind of alignment.

Example: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxx, e-mail is the essential tool xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx With nearly 70% xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxx importance of e-mail communication servers for xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
Ok, now I understand the problem. I'll try to sum up and give some answers:

1. Word breaking does not work because only blanks are recognized as possible break positions: The word break algorithm used in StarOffice is different from the one used in OpenOffice.org. The current algorithm in OpenOffice.org simply searches for blanks in the string, whereas the algorithm in StarOffice is much more sophisticated (locale dependent). This algorithm could not be open-sourced for legal reasons, but this problem is going to be solved in one of the next versions.

2. The font selector does not show the Asian font when the cursor is in an Asian part of the text. We have different fonts for different script types in the Writer. If you have a look at the Format - Character dialog, you can set two fonts, one for Western and one for Asian text (this requires Asian language support to be enabled in Options - LanguageSettings). Unfortunately this does not work for OpenOffice.org, only for StarOffice, because in OpenOffice.org there is currently no functionality to distinguish different script types. The reason for this is the same as the one for 1. This is going to be solved as well.

3. The extra space between different script types does not work in OpenOffice.org because (as already mentioned in 2.) there is no functionality to distinguish different script types. Will be solved.
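For illustration only, the difference between a blank-only break search and a break search that knows about Chinese characters can be sketched roughly as below. This is not the OpenOffice.org or StarOffice algorithm; the function names are hypothetical, the CJK check is an assumption covering only the basic CJK Unified Ideographs block, and real line breaking also has to respect punctuation rules that are left out here.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    # Assumption: only the basic CJK Unified Ideographs block is checked.
    return "\u4e00" <= ch <= "\u9fff"


def blank_only_breaks(text: str) -> List[int]:
    """Break positions found by searching for blanks: a line may start after each blank."""
    return [i + 1 for i, ch in enumerate(text) if ch == " " and i + 1 < len(text)]


def cjk_aware_breaks(text: str) -> List[int]:
    """Also allow a break next to any Chinese character, except directly before a blank."""
    breaks = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        beside_cjk = (is_cjk(prev) or is_cjk(cur)) and cur != " "
        if prev == " " or beside_cjk:
            breaks.append(i)
    return breaks


# Arbitrary sample text: five Chinese characters followed by an English word.
sample = "\u4e2d\u6587\u6392\u7248\u4e2d mail"
print(blank_only_breaks(sample))  # only the position after the blank
print(cjk_aware_breaks(sample))   # additionally every position inside the Chinese run
```

With the blank-only search, a run of Chinese characters offers no break position at all, which matches the "very long unbreakable word" behavior reported in this issue.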
*** Issue 4785 has been marked as a duplicate of this issue. ***
*** Issue 5793 has been marked as a duplicate of this issue. ***
*** Issue 7600 has been marked as a duplicate of this issue. ***
It is quite a different picture on my machine. I ran OOo on Windows Me (Taiwan version) and got the following results.

(1) I loaded a Word file (.doc) with Chinese characters, and all Chinese characters were displayed as squares. To my knowledge, on MS Windows these squares denote characters that have no corresponding entry in the font file.
(2) The fonts for English and for Chinese can be set separately, so I picked a Chinese font as the font for Chinese characters, and nothing happened.
(3) I picked another Chinese font as the font for English characters, and now all Chinese and English characters are displayed normally, using the font designated for English characters. (If I pick an English font for English characters, of course, the Chinese characters still cannot be displayed.)

I think OOo is able to handle Chinese characters in the kernel but fails to distinguish between Chinese and English character codes when applying fonts, so the English font is applied to all characters, regardless of whether they are Chinese or English.

CW Cheng is using Windows 2000, which is based on Unicode, while I am using Windows Me, which is based on ASCII (for English) and Big5 (for Chinese). I don't know whether that makes the difference.

BTW, from this point of view, the word-wrap problem should be a different one from this. But if OOo fails to recognize Chinese character codes, treating each Chinese character as a single word would be impossible. This is a prerequisite.
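As a rough sketch of what distinguishing character codes when applying fonts could look like (an illustration under assumptions, not how OOo does it): split the text into runs by script type and give each run the font chosen for that script. The function names are hypothetical, the script check only covers the basic CJK Unified Ideographs block, and the font names are the examples used earlier in this issue.

```python
from itertools import groupby
from typing import List, Tuple


def script_of(ch: str) -> str:
    # Assumption: anything outside the basic CJK Unified Ideographs block counts as Western.
    return "asian" if "\u4e00" <= ch <= "\u9fff" else "western"


def font_runs(text: str, western_font: str, asian_font: str) -> List[Tuple[str, str]]:
    """Split text into consecutive same-script runs and pair each run with its font."""
    fonts = {"western": western_font, "asian": asian_font}
    return [
        (fonts[script], "".join(chars))
        for script, chars in groupby(text, key=script_of)
    ]


# Arbitrary sample: an English word, two Chinese characters, another English word.
for font, run in font_runs("OOo \u4e2d\u6587 ok", "Georgia", "mingLiu"):
    print(font, repr(run))
```

If the script check is missing, every character falls into a single run and gets the Western font, which would explain the result in (3) above.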
FME: Works in 643.
FME->SBA: Fixed.
Closed.
This task is fixed or worked in OOo 1.1 beta2.
closed ...