Forrest / FOR-18

support multiple languages (i18n)

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7
    • Component/s: Core operations
    • Labels:
      None

      Description

      In my current environment for developing static multilingual web sites, I use an ant build.xml and the m4 macro preprocessor to achieve the following (sample):
      1) index.en.m4 gets converted to index.en.html
      The *.en.m4 contains all language dependent text (similarly *.de.m4 for German) and includes
      index.m4 that contains the page's content layout.
      [(^\.)+].m4 includes sitedef.m4, where I define all global parts of the website (e.g. navigation structure and unique content such as phone numbers, filenames, etc.). This in turn includes sitedefs.en, sitedefs.de, ... respectively for global, language-dependent definitions.
      2) Dependencies
      a) upon change of [(^\.)+].m4, all depending *.*LANG*.html get rebuilt
      b) upon change of sitedef.m4, build.xml, and the like, all *.html get rebuilt
      c) upon change of sitedefs.en all *.en.html get rebuilt.
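
      The rebuild rules listed above can be sketched as a small lookup. This is only an illustration in Python, not the reporter's actual ant/m4 setup; the page names, the language list and the function name outputs_to_rebuild are hypothetical.

```python
from typing import List, Set

LANGS: List[str] = ["en", "de"]          # assumed language codes
PAGES: List[str] = ["index", "contact"]  # hypothetical page names


def outputs_to_rebuild(changed: str,
                       pages: List[str] = PAGES,
                       langs: List[str] = LANGS) -> Set[str]:
    """Return the *.LANG.html files that depend on the changed source file."""
    all_outputs = {f"{p}.{l}.html" for p in pages for l in langs}

    # Rule 2b: global definitions or the build file changed, rebuild everything.
    if changed in ("sitedef.m4", "build.xml"):
        return all_outputs

    # Rule 2c: language-specific global definitions, e.g. sitedefs.en.
    if changed.startswith("sitedefs."):
        lang = changed.split(".", 1)[1]
        return {o for o in all_outputs if o.endswith(f".{lang}.html")}

    parts = changed.split(".")
    # Rule 2a: a layout file such as index.m4 affects every language variant.
    if len(parts) == 2 and parts[1] == "m4":
        return {f"{parts[0]}.{l}.html" for l in langs if parts[0] in pages}
    # Rule 1: a language file such as index.en.m4 affects only index.en.html.
    if len(parts) == 3 and parts[2] == "m4":
        return {f"{parts[0]}.{parts[1]}.html"}
    return set()
```

      For example, outputs_to_rebuild("index.m4") yields the English and German variants of the index page, while outputs_to_rebuild("index.de.m4") yields only index.de.html.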

      Obviously, I could use the exact same approach to create .xml wherever I created .html before, but my long-term goal is to get rid of m4. Has anybody already put some thought into how this would be done with Forrest?

          Activity

          Juan Jose Pablos made changes -
          Fix Version/s 0.7-dev [ 12310031 ]
          Resolution Fixed [ 1 ]
          Status Open [ 1 ] Closed [ 6 ]
          Juan Jose Pablos added a comment -
          Closing this issue, as Forrest has i18n support. Most of the information recorded here no longer has to do with the resolution of this issue.
          Thorsten Scherler added a comment -
          The linked issue has to be fully resolved and incorporated into the solution found by this issue.
          Thorsten Scherler made changes -
          Link This issue blocks FOR-506 [ FOR-506 ]
          Juan Jose Pablos added a comment -
          I put some information about how to create a static site with Forrest here:
          http://casa.che-che.com/blog/2005/05/10/internalization-a-site-using-forrest-07-dev/
          Juan Jose Pablos added a comment -
          I have committed the adaptation with org.apache.cocoon.matching.LocaleMatcher.

          However, this adaptation requires using index.lang.xml instead of index_lang.xml.
          Juan Jose Pablos added a comment -
          The fall-back implementation with the i18n transformer is not working in Cocoon. I have filed a bug for this:
          http://issues.apache.org/bugzilla/show_bug.cgi?id=32560
          Juan Jose Pablos added a comment -
          Created an issue on Cocoon:
          http://issues.apache.org/bugzilla/show_bug.cgi?id=32231
          Juan Jose Pablos added a comment -
          The Cocoon LocaleAction changed its API, which breaks the i18n stuff. Please test whether it is working for you.
          Cheers,
          Rupert BARROW added a comment -
          Sorry about this, but I cannot get i18n to work in Forrest 0.6.
          I have read everything about i18n-foo.html, foo_locale.html, translations/language, etc.

          I have not been able to translate menus, tabs or contents.

          Could someone please point me in the right direction? I promise to contribute a WORKING default forrest seed site if you get me running ...

          Thanks,
          Rupert
          David Crossley made changes -
          Component/s Core operations
          Juan Jose Pablos made changes -
          Summary support mulitple languages support multiple languages (i18n)
          Juan Jose Pablos added a comment -
          Added support for single files.
          For a request for foo.html, Forrest will look for foo_lang.xml.
          Juan Jose Pablos added a comment -
          I added i18n support for the tabs as well. If you want to test it:

          Run "forrest seed", add
          i18n=true
          to your forrest.properties, and run "forrest run".

          Check under "src/documentation/translations" for the files.
          Juan Jose Pablos added a comment -
          OK, I got a first version:
          Allows menu labels to be displayed in another language out of a catalog. Currently there are two languages: Spanish [es] and Italian [it].

          It works, but Forrest's needs go a bit further.

          It has been added to CVS for a demonstration:

          add project.i18n=true to forrest.properties
          run:

          forrest seed
          forrest run
          Juan Jose Pablos made changes -
          Comment 10328
          Juan Jose Pablos made changes -
          Assignee cheche
          Raymond Penners added a comment -
          I am a Forrest novice, but would really like to use it for our web site (www.workrave.org). For this purpose, Forrest would need to generate a static, multilingual web site that will be served by Apache using content negotiation.

          Is this currently possible?

          I found some references to the i18n:translator, but I could not find a proper example of how to set this up.
          Juan Jose Pablos added a comment -
          Babelfish lets you navigate a site with Systran translation, but anyone who feels this is necessary is welcome to submit patches.

          Whatever the solution is, it should use the same pattern as httpd to reduce the learning curve:
          http://httpd.apache.org/docs/content-negotiation.html

          I like the idea of using I18n transformer for menus:
          http://cocoon.apache.org/2.1/userdocs/transformers/i18n-transformer.html
          Rupert BARROW added a comment -
          Hi,
          I'm more of a potential Forrest user than a Forrest-Cocoon techie.

          My idea of I18n in Forrest-generated sites is the following :

          - allow for several schemes for I18n content layout:
          1) all-in-one file for synchronous multi-language content publication; this is rather rare in real life
          2) dictionaries or catalogs: I don't really believe in these, but they are probably useful, as shown in the Cocoon samples
          3) I would prefer 1 XML source file per language

          - make a partially translated web site look completely translated, with the help of e.g. BabelFish (babelfish.altavista.com) automatic translation. The idea is that I translate the important parts of my web site myself (and manage the multilingual content for these pages), whereas I rely on BabelFish links (could Forrest use things like Cocoon "transformers"?) to translate other pages.

          BabelFish could translate source XML or published HTML. In either case, I would like to cache the translated results : if XML is translated, my local translator can re-edit the cached page and I could then include the result in the XML source contents of my site.

          I would also like to be able to accept community-contributed edited translations of previously automatically-translated, poor quality, pages, once again to include a better quality translation in my site XML source contents.

          What do you think ?
          Ralf Hauser made changes -
          Field Original Value New Value
          Attachment multLang.zip
          Ralf Hauser added a comment -
          index.xml adapted for en (and in a few minutes, it could provide a "de" version as well if needed).
          The "include" doesn't work so far, despite my having extended my sitemap.xmap with <map:transformer name="xinclude" src="org.apache.cocoon.transformation.XIncludeTransformer"/>

          For setting my variables, I used the velocity syntax - I understand that this is not cocoon-compliant, but I am sure an equivalent syntax for cocoon exists...
          Ralf Hauser added a comment -
          Jeff, you mention in your referenced 021227 mailing-list post: "... doesn't lock users into using only Forrest. Only problem is, it assumes rather more XML knowledge than I'd expect most doc editors would have." As a novice to XML, I am one of those who needs a simple solution - at least to start with.

          You asked for real-life examples - I will attach one next.

          Re the later posts:
          1) the hierarchy en-US --> en --> nothing looks good to me, but because I also want to cater to
          1a) people who do not know how to change the locale on their PC or
          1b) who visit an internet cafe/use someone else's machine and should not mess with the configuration, this hierarchy approach wouldn't make my "language-Switcher div" obsolete.
          Since all files along Konstantin's hierarchy have the .xml extension and are possibly self-standing, I named my layout file .xmi (like xml include) since on its own, it will not be intelligible.
          2) Maintaining parallel URI spaces looks awkward to me because the layout hopefully is redundant for 99.9% and
          2a) the verbosity differences can hopefully be handled with proper browser rendering (e.g. Mozilla:viewport?)
          2b) I have my "language-Switcher div" defined in m4 such that, by setting a single property, instead of offering a switch between languages, it alerts that the page is only available in English.
          3) While I agree with Steven that it is confusing having multilingual versions handled inside one document, I fear that the same argument will be applied to my structure of having a single page pieced together out of a content file and a layout file among others. My counter-argument would be:
          3a) Thanks to cocoon, despite not being WYSIWYG, a simple reload on localhost:8888 lets one instantly experience the effect of changes to any of the files undertaken
          3b) an ant-like up-to-date mechanism could at least help the editors to keep translations "in sync":
          When a .en.xmi is newer (assuming English is the master version), the .de.xml, .fr.xml, .??.xml will get a warning <echo>.
          Obviously such a warning can be avoided by a simple "touch" command, but a reasonable editor wouldn't do that. Sure - this obviously doesn't ensure that the translations are semantically identical, but at least it will become very obvious if, after adding an additional $text_n+1, this variable name instead of some additional text becomes visible in, for example, the de.html - then most likely the editor forgot to update the .de.xml content-wise!
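
          The up-to-date check proposed in 3b) can be sketched as a timestamp comparison. This is a minimal illustration in Python rather than an ant target; the function name stale_translations and the idea of passing explicit file paths are assumptions made for the example.

```python
from pathlib import Path
from typing import List


def stale_translations(master: Path, translations: List[Path]) -> List[Path]:
    """Return the translation files that are missing or older than the master.

    Only modification times are compared, so a plain "touch" on a
    translation silences the warning, exactly as noted in the comment above.
    """
    master_mtime = master.stat().st_mtime
    stale = []
    for translation in translations:
        if not translation.exists() or translation.stat().st_mtime < master_mtime:
            stale.append(translation)
    return stale
```

          A build step could print a warning for each returned path, playing the role of the <echo> mentioned above.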
          Konstantin Piroumian added a comment -
          Absolutely agree with Steven.

          For multilingual content it's possible to use locale-sensitive sitemap aggregation, so the body part of the document could be in the selected language.

          I've been thinking about either an i18n protocol or an input module to be used to detect if a file exists for the given locale using the same scheme as used to retrieve the message catalogues, e.g.:
          Given
            locale - en_US
            requested - index[.xml]
          Result
            index_en_US.xml, if not found then
            index_en.xml, if not found then
            index.xml, if not found then 404 error
          or
            /en/US/index.xml
            /en/index.xml
            index.xml

          something like this.
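
          The lookup order described above can be sketched as follows. This is only an illustration in Python, not Forrest or Cocoon code; the function names candidate_names and resolve are hypothetical, and only the underscore naming variant (index_en_US.xml) is covered.

```python
from pathlib import Path
from typing import Iterator, Optional


def candidate_names(stem: str, locale: str, ext: str = ".xml") -> Iterator[str]:
    """Yield file names from most to least locale-specific."""
    parts = [p for p in locale.split("_") if p]
    # Drop one locale component at a time: en_US, then en.
    for n in range(len(parts), 0, -1):
        yield f"{stem}_{'_'.join(parts[:n])}{ext}"
    # Finally fall back to the locale-neutral file.
    yield f"{stem}{ext}"


def resolve(root: Path, stem: str, locale: str) -> Optional[Path]:
    """Return the first existing candidate under root, or None (the 404 case)."""
    for name in candidate_names(stem, locale):
        path = root / name
        if path.is_file():
            return path
    return None
```

          For the locale en_US and the request index, candidate_names produces index_en_US.xml, index_en.xml and index.xml in that order, matching the first scheme in the comment.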

          It'd be fine if we could invent some mechanism for this kind of thing, e.g.:

            <map:generate src="i18n:file:/{1}" /> or better via an input module
            <map:generate src="{i18n:file/{1}}" /> - but this is not possible currently.

          (You know, I have a trigger acting on 'i18n', 'multilingual' words etc. ;))
           
            
          Steven Noels added a comment -
          We had this discussion before, and in other places as well. Adding i18n on navigation labels would be interesting, if supported by the site.xml scheme (an argument to switch to name/value pairs IMHO).

          i18n content is something different however. I believe multilingual versions shouldn't be handled inside one document, for a number of reasons:

          * it is confusing to the document editor
          * there's a good chance not everything is translated in every language
          * typically, the verbosity between language variants can be different

          so I would be +1 on navigation i18n, as we did for xreporter too, but not for document content: different languages will reside in parallel URI spaces and should be edited accordingly (i.e. separate)

          All IMHO, of course. Konstantin is our i18n master.
          Konstantin Piroumian added a comment -
          It's possible to use the i18n transformer (from Cocoon 2.1-dev) to generate all the menu labels and other navigation/site parts using multilingual dictionaries.

          As for the content of the documents, it's possible to use either XInclude/CInclude capabilities to include content in the needed language or use the latest i18n markup extensions:
          <i18n:choose>
            <i18n:when locale="en">English content</i18n:when>
            <i18n:when locale="ru">&#1056;&#1091;&#1089;&#1089;&#1082;&#1086;&#1077; &#1089;&#1086;&#1076;&#1077;&#1088;&#1078;&#1072;&#1085;&#1080;&#1077;</i18n:when>
            <i18n:otherwise>{Some other content}</i18n:otherwise>
          </i18n:choose>

          Also, it's possible to use LocaleAction (or better an input module) to obtain locale information in the sitemap and aggregate document's contents based on the selected locale.
          Jeff Turner added a comment -
          I think you could implement this with Cocoon's XInclude transformer (http://xml.apache.org/cocoon/userdocs/transformers/xinclude-transformer.html), which allows one XML document to include another.

          So index.en.html would have:

          <html xmlns:xi="http://www.w3.org/2001/XInclude">
          <body>
            <!-- Include language-neutral layout: -->
            <xi:include href="index-layout.xml"/>
            Blimey, .en content.
          </body>
          </html>

          The Cocoon caching system will handle 'dependencies'. Eg, if index-layout.xml changes, then the next request for index.en.html will include those changes.

          With the xpointer() scheme, the XInclude would be able to address specific nodes in a sitedefs file. A bit horrible though.. there was a long thread on how best to do a 'sitedefs' file in Forrest:

          http://marc.theaimsgroup.com/?l=forrest-dev&m=104099654113176&w=2

          Any real-life examples you could provide would be interesting.

          --Jeff


          Ralf Hauser created issue -

            People

            • Assignee: Juan Jose Pablos
            • Reporter: Ralf Hauser
            • Votes: 5
            • Watchers: 5
