Apache OpenOffice (AOO) Bugzilla – Issue 91226
Traditional mongolian support
Last modified: 2017-05-20 11:41:47 UTC
Please support the traditional Mongolian writing direction, which is written from top to bottom with lines running from left to right. Here is the patch. If you have any questions, don't hesitate to contact me.
Created attachment 54849 [details] Traditional mongolian support
I am raising the priority to P2 due to Pavel's message: http://l10n.openoffice.org/servlets/ReadMsg?list=dev&msgNo=9545 Is it possible to integrate it into the SRC680 tree by 20 July 2008?
ama->badaa: Thank you for your patch. We will not be able to integrate it until July, 20th because we are right now in a "stabilization mode" for OOo3.0. But when this is done and the code line for OOo3.1 is opened (July/August I think) we will integrate your patch ASAP.
I applied the patch in CWS mongolianlayout. It will take some time to review and perhaps improve the patch.
Does this patch solve http://qa.openoffice.org/issues/show_bug.cgi?id=3692 - vertical text in table cells? Is it good enough to be integrated in 3.1?
Oops, today I received a bug report via mn.openoffice.org. I forgot to add Mongolian to the CTL menu. mba, can you please complete it? Which procedures are related to this entry?
Do you mean the menu in "Tools-Options-LanguageSettings-CTL"?
Exactly!
Again, the entry name should not be "Mongolian" but "Mongolian script"
I checked all relevant files; it seems that everything is there except a String that shall be displayed in the list box. The file for this is svtools/source/misc/langtab.src. Currently it contains
< "Mongolian" ; LANGUAGE_MONGOLIAN ; > ;
AFAIK this is the cyrillic variant, so perhaps we should name that "Mongolian (cyrillic)" and add the missing line, so that we have
< "Mongolian (cyrillic)" ; LANGUAGE_MONGOLIAN ; > ;
< "Mongolian" ; LANGUAGE_MONGOLIAN_MONGOLIAN ; > ;
This should be enough to see the string in the list box. Of course if you want to stick with "Mongolian" for the cyrillic variant you can choose a suitable string for LANGUAGE_MONGOLIAN_MONGOLIAN as you see fit.
I just learned something new, and I'm afraid that I have some bad news. What I didn't know (and nobody from the team knew): in OOo we can only support *either* Mongolian *or* Mongolian(cyrillic). The reason is that we don't have real script support and the script type is derived from the "language attribute" (should be "locale attribute") of the text. Implementing script type support in OOo is a lot of work and will most probably not happen before the OpenJDK guys have defined how *they* will fix this problem (they also have it). We want to stay "compatible" with them, as the UNO struct we use for transporting the relevant information is directly mapped to the Java type in our Java-UNO binding. There is a possible workaround. It will enable users to use either script type in their documents, but not both Mongolian and Mongolian(cyrillic) in the same document. Before I go into the details - what do you think? The only alternative I see is throwing out Mongolian(cyrillic) but this will damage older documents written with that language and script type.
Please define like that:
< "Mongolian (cyrillic)" ; LANGUAGE_MONGOLIAN ; > ;
< "Mongolian (traditional)" ; LANGUAGE_MONGOLIAN_MONGOLIAN ; > ;
>There is a possible workaround. It will enable users to use either script type
>in their documents, but not both Mongolian and Mongolian(cyrillic) in the same
>document.
Most probably, mongolian users want to use just mixed. What do you mean with "script type"? (is it cyrillic and traditional?)
>The only alternative I see is throwing out Mongolian(cyrillic) but this will
>damage older documents written with that language and script type.
How is it for new documents? I think, really few users have created documents with Mongolian (cyrillic). Could users use both script types?
> Please define like that:
> < "Mongolian (cyrillic)" ; LANGUAGE_MONGOLIAN ; > ;
> < "Mongolian (traditional)" ; LANGUAGE_MONGOLIAN_MONGOLIAN ; > ;
Sorry, that was before I recognized the problem. We can have only one of them at the moment anyway, so we can stay with "Mongolian".
>> There is a possible workaround. It will enable users to use either script
>> type in their documents, but not both Mongolian and Mongolian(cyrillic) in
>> the same document.
>
> Most probably, mongolian users want to use just mixed. What do you mean with
> "script type"? (is it cyrillic and traditional?)
Using mixed is not possible in the current OOo version as we don't have a "script type" attribute for text, only "language", and OOo uses a fixed assignment of language to script type. And "script type" in OOo currently can have only three values: "Asian", "CTL" and "Western", and the latter is just everything that does not fit into the other categories. So "Western" also contains all cyrillic variants. Currently "Mongolian" is assigned to "Western". We can keep it that way (plan A) or we can move it to "CTL" (plan B). With some creative coding we can at least make this assignment on a per-document basis, but inside a particular document we have to assign "Mongolian" to either script type (plan C).
>> The only alternative I see is throwing out Mongolian(cyrillic) but this will
>> damage older documents written with that language and script type.
>
> How is it for new documents? I think, really few users have created documents
> with Mongolian (cyrillic). Could users use both script types?
My suggestion here would be plan B: assign "Mongolian" to "CTL". If users had created a document with Mongolian text, this text would now magically become CTL text. What exactly this would mean for the way OOo treats this document has to be figured out. Most probably the spell checking won't work for Mongolian text with cyrillic layout. New documents would automatically have the right language assignment to the "CTL" section, but in older versions of OOo they would have the same problem as the cyrillic ones in future versions. BTW: only with plan A do we avoid this problem!
Plan A would mean: keep it like it is. The consequence is that most probably spell checking won't work with Mongolian documents in traditional layout.
With plan C (allow assigning "Mongolian" to either script type group) we could avoid problems at least for future OOo versions if the whole document uses either cyrillic or traditional layout.
"Plan D" would be implementing script type support in OOo. AFAIK this is possible since ODF 1.2 and so basically doable. But the effort is high and definitely not doable for OOo3.1.
Does that answer your question so that you can tell me if we should go for plan A, B or C? While "A" and "B" are done fast and easily, "C" would require some additional coding.
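To make plans A, B and C easier to compare, here is a minimal Python sketch of a fixed language-to-script-type assignment. It is an illustration only: the table contents, the function name and the per-document override are hypothetical and are not taken from the OOo code base.

```python
# Hypothetical model of a fixed language -> script type assignment.
WESTERN, ASIAN, CTL = "Western", "Asian", "CTL"

# Plan A: "mn" stays in the Western group (cyrillic layout).
PLAN_A = {"en": WESTERN, "ru": WESTERN, "ja": ASIAN, "ar": CTL, "mn": WESTERN}

# Plan B: "mn" moves to the CTL group (traditional layout).
PLAN_B = dict(PLAN_A, mn=CTL)


def script_type(language, table, document_override=None):
    """Return the script type group for a language code.

    document_override models plan C: a per-document mapping that is
    consulted before the global table. Languages found in neither
    mapping fall back to Western ("everything that does not fit").
    """
    if document_override and language in document_override:
        return document_override[language]
    return table.get(language, WESTERN)


if __name__ == "__main__":
    print(script_type("mn", PLAN_A))               # plan A
    print(script_type("mn", PLAN_B))               # plan B
    print(script_type("mn", PLAN_A, {"mn": CTL}))  # plan C, one document
```

In all three plans a single document still maps "mn" to exactly one group, which is why mixing both layouts in one document stays out of reach without plan D.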
The 2 variants of Mongolian are effectively different languages - the words in traditional Mongolian and Cyrillic Mongolian have completely different spellings and the rendering rules for the 2 variants are also completely different (Mongolian Cyrillic follows basically the same rules as other Cyrillic scripts whereas traditional Mongolian is more like Arabic, with the glyphs of the letters changing form according to their position in the word, etc.). So in my view the best solution is to treat Mongolian as 2 languages (plan E): language 1) traditional Mongolian; and language 2) Mongolian Cyrillic. Then Mongolian Cyrillic can be included in Western along with the other Cyrillic variants; and traditional Mongolian can be included in CTL, alongside Arabic. This also solves the problem of mixing the 2 variants of Mongolian in the same document (assuming you can embed Western in CTL and vice versa), and the problem of older documents written in Cyrillic. And since plan E is basically a combination of A and B, presumably it should also be fast and easy? (BTW I don't think plan D works - it's not possible to define a transliteration between the traditional and Cyrillic scripts; the words are actually different, and you can't just change from one to the other by just changing the script.)
Sorry, but I already explained why this is not possible at the moment: we can differentiate between languages only if they have different ISO locales. This is unfortunately not the case for Mongolian cyrillic/traditional. And you seem to misunderstand "plan D". Plan D is to enable OOo for script type support. This would enable it support both Mongolian variants, even if they share the ISO locale. Everything else than plan D is a hack.
Some further explanation: To treat text correctly, OOo must know which (ISO) language and which script type it has. As OOo does not have script type support in the file format at the moment, we have a fixed assignment language<->script type and this works in most cases. We fail in all cases where an ISO locale is used with different script types, like Mongolian or Serbian. So the only correct and future-proof fix would be extending OOo's text attributes to have language *and* script type. ODF 1.2 allows us to do so, but the implementation will cost a lot of time. So my plans A-C describe how we should treat Mongolian in the meantime: either keep it "cyrillic always", "traditional always" or "either cyrillic or traditional on a per-document basis". BTW: I don't understand how you see conversions involved in my "plan D". But I hope that this plan is clearer now.
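The idea behind plan D can be sketched as a text attribute that carries both a language and an optional script. The following Python sketch is only an illustration under stated assumptions: the class, the default table and the grouping of scripts are hypothetical, and the four-letter script codes follow ISO 15924.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical fixed assignment, used only when no script is stored.
DEFAULT_SCRIPT = {"mn": "Cyrl", "sr": "Cyrl", "ar": "Arab", "en": "Latn"}

# Hypothetical grouping of scripts into the script type groups.
SCRIPT_GROUP = {"Mong": "CTL", "Arab": "CTL", "Cyrl": "Western", "Latn": "Western"}


@dataclass(frozen=True)
class TextAttr:
    language: str                 # ISO 639 language code
    script: Optional[str] = None  # ISO 15924 script code, if stored


def effective_script(attr):
    """Use the explicit script if present, else the fixed assignment."""
    return attr.script or DEFAULT_SCRIPT.get(attr.language, "Latn")


def script_group(attr):
    """Map the effective script to a script type group."""
    return SCRIPT_GROUP.get(effective_script(attr), "Western")


if __name__ == "__main__":
    # Two runs of text in the same document, same language, different script.
    print(script_group(TextAttr("mn")))
    print(script_group(TextAttr("mn", "Mong")))
```

With an explicit script attribute, both Mongolian variants can coexist in one document, and text without a stored script keeps today's behaviour.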
Plan D seems to me the most sensible one. But we have to stay realistic, so I would prefer Plan C in order to support the two Mongolian variants in OOo 3.1: the cyrillic script is the more common one in Mongolia (it is also the official script), and I also want to support the traditional Mongolian script in OOo. I hope that Plan D will come some day. Is it really not possible to assign Mongolian (traditional) to mn_TR? I had planned to implement the traditional Mongolian locale with this code, and it has been added to the OOo build system (issue 88665).
No, mn-TR is not a valid ISO locale. And this brings us to another problem that can make even "Plan C" impossible. There is i18n-based functionality like our "break iterator" that is supposed to do something with text based on its "language" attribute, which we transport as the UNO struct com.sun.star.lang.Locale. This struct does not allow us to transport the script type, and so e.g. the break iterator is not able to distinguish between Mongolian text in cyrillic and traditional layout. As the i18n code, and even more the ICU that it uses for many purposes, doesn't know anything about configuration settings, I currently see no way to consider additional information to make the wanted differentiation inside our i18n library. So in the end we can have linguistic and other i18n support only for either the cyrillic (Plan A) or the traditional (Plan B) variant. cc'ing Eike, our expert for i18n stuff.
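The limitation described above can be illustrated with a small Python sketch. The tuple mirrors the three fields of the locale struct (Language, Country, Variant); the rule names and the lookup function are hypothetical stand-ins and do not reflect the real break iterator.

```python
from collections import namedtuple

# Same three fields as the locale struct: there is no slot for a script.
Locale = namedtuple("Locale", ["Language", "Country", "Variant"])


def pick_break_rules(locale):
    """Hypothetical rule lookup that can only see the locale fields."""
    rules = {"mn": "cyrillic-word-rules"}
    return rules.get(locale.Language, "break-at-blanks")


# Both texts are tagged with the same locale, whatever script they use.
cyrillic_text_locale = Locale("mn", "MN", "")
traditional_text_locale = Locale("mn", "MN", "")

if __name__ == "__main__":
    print(cyrillic_text_locale == traditional_text_locale)
    print(pick_break_rules(cyrillic_text_locale))
    print(pick_break_rules(traditional_text_locale))
```

Because the two locale values compare equal, any component that receives only the locale has to pick one set of rules for both variants.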
As the css::lang::Locale struct corresponds to the java.util.Locale this essentially boils down to how Java will handle script codes in future. The current Language/Country/Variant fields were modeled after IETF RFC 1766. That has been superseded by IETF BCP47 (RFC 4646 and 4646bis and related) language tags. There was a "Java Locale Enhancement Project" initiated at OpenJDK, see http://mail.openjdk.java.net/pipermail/discuss/2008-October/001341.html That mail in the quoted text at the bottom also includes valuable links. Further references: http://mail.openjdk.java.net/pipermail/announce/2008-November/000063.html http://openjdk.java.net/projects/locale-enhancement/ I'm eagerly awaiting the outcome. However, we cannot support different script types for the same language as long as that issue isn't solved. After that, conversion between MS-LangID and lang::Locale will have to be adapted, handling inside i18npool to pass it down to ICU, and quite some effort will have to go into storage of fo:script elements in ODF (v1.2 will have that) respectively a RFC4646 fall-back where the combination of fo:language, fo:script and fo:country is not sufficient. I think this issue here should be retargeted to OOo3.x, doing so. Thank you for being patient ;) Eike
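For readers unfamiliar with BCP 47 tags, the following Python sketch splits a simple tag into language, script and region, which is roughly the information the fo:language, fo:script and fo:country attributes would carry. It is a deliberately simplified illustration, not a full BCP 47 parser; the function name is hypothetical, and variants, extensions and private-use subtags are ignored.

```python
def split_language_tag(tag):
    """Split a simple BCP 47 tag into (language, script, region).

    Only the language[-script][-region] shape is handled. A script
    subtag has four letters; a region subtag has two letters or
    three digits. Anything after that is ignored by this sketch.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    script = None
    region = None
    for part in parts[1:]:
        if script is None and region is None and len(part) == 4 and part.isalpha():
            script = part.title()
        elif region is None and (
            (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit())
        ):
            region = part.upper()
        else:
            break
    return language, script, region


if __name__ == "__main__":
    for tag in ("mn", "mn-MN", "mn-Cyrl-MN", "mn-Mong-CN"):
        print(tag, split_language_tag(tag))
```

The script subtag is what distinguishes mn-Cyrl from mn-Mong; a struct with only language, country and variant fields has no dedicated place for it.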
OK, let me explain briefly our current status and why we'd like to find a solution of some form in the short term rather than the longer term. We have made a spell checker for the cyrillic form of Mongolian, and we've also developed support for the traditional Mongolian script, including an Open Type font with most of the rendering rules for determining the correct forms of the letters in the words. These are currently built on top of OOo2.4, and, as far as we are aware, there is no usable version of either of these things except ours. We would therefore like to make both of these "officially" available as soon as possible (i.e. before someone else implements the same things on another platform ;-) ) to encourage as many Mongolian users as possible to use OOo. If I understand the discussion we've had so far correctly (and please correct me if I don't), the situation currently is as follows: 1) there is a fixed list of "valid" languages, and we can only use languages from that list. 2) each of the languages in the list has a fixed language type - Western, CTL, or Asian. 3) we can have country variants of a language (e.g. mn-MN, etc.), but all of these must have the same language type as the basic language. 4) if/when the "script" attribute of a language is implemented, it will be possible for different scripts to have different language types. Assuming all this is correct, the following 2 possibilities occur to me: 1) Is there some sort of "dummy"/"wildcard"/"other" language defined in the list of languages? If so, we could temporarily use "other"-mn for the second version of Mongolian. (Yes, I know this is a hack, but at least it would allow us to include both forms of Mongolian now and I guess a lot of the work would be reusable after the script attribute is implemented.) 2) Is it possible to make a branch in the development, and have one form of Mongolian included in the official OOo3.1 and the other form included in an unofficial version? 
(I'm thinking of branch points as in the CVS version control system.) We could then develop the two scripts independently on the two branches (i.e. one branch follows plan A and one branch follows plan B), and then merge the branches when the script attribute is implemented. This is of course not ideal for Mongolian users who want to use both scripts because they need both versions of the system, but it's still better than only having support for one script.
Without going into details, IMHO we can have both Mongolian variants in OOo, we just can have linguistic support for only one of them. Currently "Mongolian" is a "Western" language (cyrillic script). You can add the spell checker for that language to OOo and it will work. You can also add your traditional Mongolian support. In fact I have already committed the attached patch into a CWS, and it should work (as well as in your own builds). Users just can't set the language of the text to "Mongolian", and so no spell checking is possible for text written with "Traditional Mongolian". This will require the changes in OOo that Eike and I tried to explain. So the request of badaa in #desc7 can't be fulfilled in OOo 3.1.
Note: I made this issue depend on issue 88665, as that is about adding the ISO mapping and language list box entries. Further discussion about the script type problematics should be carried over there to not clutter this issue here. @erdee: Forget about the "language type" (Western, CTL, CJK), that is only some arbitrarily (and IMHO badly) constructed classification introduced by MS, in fact it should be called "CTL, CJK, Other" instead. The problem is with the script types, and that currently multiple script types for one language are not representable in the API, and storing them in the file format is also not implemented yet. To your points above: 1. No, there are no user definable language codes reserved in ISO 639. 2. A branch of OOo: you don't really want that. You especially don't want it to maintain. If you personally want to do that, including rebasing the branch to new master milestones to catch up with development on trunk, do the release handling, QA and everything else that is needed, then of course it would be a possibility.
Well, just to be exact:
> 1. No, there are no user definable language codes reserved in ISO 639.
This is not exactly true, ISO 639-3 reserves some codes for _local use_, see http://sil.org/iso639-3/scope.asp#R
| Reserved for local use
|
| Identifiers qaa through qtz are reserved for local use, to be used in
| cases in which there is no suitable existing code in ISO 639. There are
| no constraints as to scope of denotation. These identifiers may only be
| used locally, and may not be used in interchange without a private
| agreement.
Surely a distributed OpenOffice.org and documents created with it would not meet the "private agreement" requirement.
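The reserved range quoted above is easy to check mechanically. A minimal Python sketch follows; the function name is hypothetical.

```python
def is_local_use_code(code):
    """Return True if code lies in the ISO 639-3 local-use range qaa..qtz."""
    return (
        len(code) == 3
        and code.isascii()
        and code.isalpha()
        and code.islower()
        and "qaa" <= code <= "qtz"
    )


if __name__ == "__main__":
    for code in ("qaa", "qtz", "qua", "mon"):
        print(code, is_local_use_code(code))
```

Such a check only says that a code is reserved for local use; it does not make documents tagged with it safe for interchange, which is the objection raised above.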
Hi, I am sorry for the late reply. I was ill. How can we go on now? Can we follow Plan C until a separate Mongolian locale has been defined?
No, even Plan C isn't possible. So we can do what I mentioned in my last comment: integrate the traditional layout support, but without the ability to have linguistic support for it. The language attribute of the text will be incorrect, with the consequence that spell checking is impossible and the break iterator will fall back to a default behavior (AFAIK word breaks will be blanks only).
To cut a long story short, we should stay with Plan A. For that we don't have to make any changes. Should I close this issue now or can we consider it for 3.X?
Now you are confusing me. I already have integrated the attached patch and we need this issue to continue. "Plan A" was only about the missing locale support and this is covered by issue 88665. So if you are fine with what you had in your personal work space before you created the patch, we can continue with the CWS. I will try to hand it over to QA as soon as possible if you agree.
I thought that I had started a new issue for #desc7. That confused both of us, sorry mba! I agree with you 100%. We have to continue as usual. So I am resetting the target of this issue to 3.1. Is that OK? Why I chose "Plan A": 1. Cyrillic is the official script in Mongolia. 2. We have also developed a Mongolian cyrillic spellchecker, which works really nicely and is used every day in Mongolia. 3. Currently* we don't have a Mongolian (traditional) spellchecker. Is there any possibility to support the Mongolian cyrillic spellchecker in Plan B? ;-)
Not really. We will now follow "Plan A" and then hope to get "Plan D" (script type support) added later. Please understand that I can't predict in which release we might get that fixed. But IMHO we should fix it in a not too distant future.
I'm very sorry, but we have not been able to do sufficient testing. The patch is not trivial and needs quite some testing. The plan now is to do the remaining QA as soon as the 3.1 release is done and integrate the CWS into the 3.2 code line as early as possible. Sorry again.
fixed in cws mongolianlayout
I have builds now for all platforms. Our QA will do the regression testing, will you be able to test if the build still is OK for your feature? For which platform(s) do you need builds for testing?
Although people are trying to introduce Linux into Mongolia, I'd guess that over 90% still use Windows, so that's the most important platform to get working initially. I can do some testing, provided it's possible to install the new version in parallel to my existing copy of our Mongolian version of release 3.0 - I don't want to remove or change that because I use it every day. If having both versions installed in parallel is possible, is there anything I particularly need to do when I install the new version in order to make sure that my old version remains unchanged?
It's easy for Windows, you can just use "setup.exe /a" to create a parallel installation. For Linux I could provide a tar.gz that does not need installation.
I've uploaded a Windows installation set and a Linux tar to qa-upload.services.openoffice.org in the folder "mongolianlayout". In case you install the Windows build next to another OOo 3.x version please use "setup.exe /a".
erdee/badaa: can anyone of you verify that the build works for you? We would like to integrate the CWS soon.
Sorry for not replying earlier. I tried the Windows version, but it deleted my previous version when I installed it even though I used setup.exe /a. Is there something else I should have done to stop it deleting the old version? It also made my computer VERY slow so I couldn't do much with it. This might be my computer's fault; I'm not sure. (I will be upgrading my computer later this month so I can try it again with the new one when I get it.) It did run when I installed it, even though it was so slow. However, I was expecting to be able to write vertical text (not necessarily Mongolian) from left to right, and I couldn't. Have I misunderstood what this fix should do?
All you have described sounds very strange. As I just created a new build based on milestone m45 I will upload this one tomorrow. I don't have an idea what went wrong.
I have uploaded new builds. I didn't have any problems with installing or using the uploaded Windows version, at least not with any of my documents.
Sorry for my too late reply. I was on vacation where the Internet connection is very poor. qa-upload.services.openoffice.org had been down all day.
qa-upload now should be reachable, see http://wiki.services.openoffice.org/wiki/CWS_upload
Thanks mba! I got your builds and chose the Linux tarballs to test. But I can't find the TTB LTR direction in Format->Page->Text direction. I tried activating "Enabled for complex text layout (CTL)" under "Enhanced language support", with and without the Mongolian entry in "Western" in the "Default languages for documents" section, and with "CTL" set to none or to some language. No success for any combination. How do I simply activate the direction? :)
That's interesting. I applied the attached patch and fixed some merge conflicts. The other patch in issue 91227 did not contain anything that wasn't already present. Are you sure that your patch in issue 91227 (that might be the missing piece) is correct and complete? Maybe it was a reversed patch and so it could not be applied? I will have a look at this patch again.
I have uploaded a new Linux build to the ftp server. Please give it a try. I could also provide a new Windows build in a few hours, if necessary.
verified. It works at least like my local build. Thanks MBA!
I made another code review and found some changes that I would like to discuss. Let's do it one after another. First: in SwCrsrShell::UpDown() code was added that inverts bUp in some cases. IMHO this is not the right place for this; all writing direction dependent code should be where the cursor keys are handled, in the KeyInput method of the EditWindow. In the cursor shell "up" and "down" are only logical directions, which means that e.g. for traditional Mongolian layout they correspond to "left" and "right" on the screen. Conversion from "physical screen direction" to "logical direction" should be done before calling this method, otherwise e.g. the TextCursor UNO API calls, which already use logical directions and forward to SwCrsrShell::UpDown(), wouldn't work correctly.
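The separation argued for above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in rather than the Writer code: the mode names, the key table and the classes are invented for illustration, and the assignment of "left" to the previous line in the vertical left-to-right layout is an assumption.

```python
# Hypothetical writing-mode names; not identifiers from the OOo code base.
HORIZONTAL_LTR = "horizontal-ltr"
VERTICAL_LR = "vertical-lr"  # traditional Mongolian: top to bottom, lines left to right

# Assumed mapping from physical arrow keys to logical cursor moves.
PHYSICAL_TO_LOGICAL = {
    HORIZONTAL_LTR: {"up": "line_up", "down": "line_down",
                     "left": "char_back", "right": "char_forward"},
    VERTICAL_LR: {"left": "line_up", "right": "line_down",
                  "up": "char_back", "down": "char_forward"},
}


class CursorShell:
    """Stand-in for the cursor shell: it only knows logical directions."""

    def __init__(self):
        self.line = 0

    def up_down(self, up):
        # Logical move: "up" is the previous line, whatever the layout.
        self.line += -1 if up else 1


def handle_key(shell, key, mode):
    """Key handler: convert the physical key first, then call the shell."""
    move = PHYSICAL_TO_LOGICAL[mode][key]
    if move == "line_up":
        shell.up_down(True)
    elif move == "line_down":
        shell.up_down(False)
    # Character moves are left out of this sketch.


if __name__ == "__main__":
    shell = CursorShell()
    handle_key(shell, "right", VERTICAL_LR)  # physical right = next line
    shell.up_down(True)                      # API-style logical call
    print(shell.line)
```

Because the shell never inspects the writing mode, a caller that already works with logical directions, such as an API cursor, gets the same result in every layout.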
Verified in CWS mongolianlayout.
Evaluation of the Mongolian layout in Writer reveals the following defects: (1) "Spacing - Above paragraph" has to be applied on the left of the paragraph, but it is applied on the right. (2) "Spacing - Below paragraph" has to be applied on the right of the paragraph, but it is applied on the right of the following paragraph. (3) The dialog panes Menu Format - Frame/Object - pane Type and Format - Object - Position and Size - pane Position and Size do not show the correct alignment strings for the horizontal positioning. (4) The positioning of anchored objects (Writer text frames, Writer graphics, Writer embedded objects, drawing objects and form controls) does not work correctly. od->badaa: I am currently working on defects (3) and (4) and will hopefully attach the corresponding patches to this issue. I will also commit the changes to CWS mongolianlayout. Could you please have a look at (1) and (2)?
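The expected behaviour in defects (1) and (2) amounts to a mapping from logical paragraph spacing to physical sides. A minimal Python sketch, with a hypothetical function and hypothetical mode names:

```python
def physical_spacing(mode, above, below):
    """Map logical paragraph spacing to physical sides of the paragraph.

    In the vertical left-to-right (Mongolian) layout, "above" lands on
    the left and "below" on the right of the same paragraph. Any other
    mode is treated as horizontal in this sketch.
    """
    if mode == "vertical-lr":
        return {"left": above, "right": below}
    return {"top": above, "bottom": below}


if __name__ == "__main__":
    print(physical_spacing("horizontal", 5, 10))
    print(physical_spacing("vertical-lr", 5, 10))
```

Both values stay attached to the paragraph they were set on, which is the behaviour defect (2) asks for.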
CC myself
Further defect found: (5) The dialog panes Menu Format - Frame/Object - pane Type and Format - Object - Position and Size - pane Position and Size do not work correctly regarding the minimum and maximum values for the vertical and horizontal positions. I am also working on this defect together with (3).
Another defect I have found: (6) When the relation between page size and zoom allows it to place several pages beside each other in the view, the pages are arranged from right-to-left. This has to be changed in the Mongolian layout (vertical left-to-right layout). od->badaa: Please have a look.
adjustment to the object positioning for Mongolian layout - changed files: /sw/source/core/anchoredobjectposition.hxx, /sw/source/core/objectpositioning/anchoredobjectposition.cxx, /sw/source/core/objectpositioning/tocntntanchoredobjectposition.cxx, /sw/source/core/objectpositioning/tolayoutanchoredobjectposition.cxx, rev. 275804
Created attachment 64546 [details] changes to object positioning for Mongolian layout
adjustment to the object positioning dialog for Mongolian layout -changed files: /sw/inc/fesh.hxx, /sw/source/core/frmedt/fews.cxx, /sw/source/ui/inc/frmpage.hxx, /sw/source/ui/inc/frmmgr.hxx, /sw/source/ui/frmdlg/frmpage.cxx, /sw/source/ui/frmdlg/frmmgr.cxx, /sw/source/ui/docvw/edtwin.cxx, /sw/source/ui/shells/drwbassh.cxx, /sw/source/ui/uiview/viewtab.cxx, rev. 275809
Created attachment 64548 [details] adjustments to the object positioning dialog
I have provided patches to solve the defects (4) and (3)+(5). I have also committed these patches to CWS mongolianlayout. There may be some minor defects left in the object positioning and/or the object positioning dialog. Please have a closer look.
@badaa: I'm sorry, but there are too many problems left, I have reopened this issue. Can you support Oliver?
Hi Od and Mba, thanks for your corrections and contributions! Sorry for my absence. I was really busy for the last 8 months and have been ill from 20.08.09 until now (back problem). I will look at and support Oliver's contributions as soon as possible.
Adjusting issue target to 3.2 because CWS mongolianlayout was shifted to target OOo 3.2.
It was decided to integrate the changes if all tests are passed, but not activate the new writing direction in the user interface. So no additional testing is required now.
Set target OOo 3.4
Verified in CWS mongolianlayout. Traditional mongolian writing direction can not be set via UI any more.
Please enable it in the UI according to Plan B. I submitted a separate issue under https://issues.apache.org/ooo/show_bug.cgi?id=121853