  1. Spatial Information Systems
  2. SIS-137

<gmd:LocalisedCharacterString> locale shall be a URI

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.3
    • Fix Version/s: None
    • Component/s: Metadata, Utilities
    • Labels:
      None

      Description

      The locale attribute of the <gmd:LocalisedCharacterString> element is declared by the XML schema as a value of kind xs:anyURI. However, SIS 0.3 treats it as a plain string that directly contains the language code, ignoring the "#locale-" prefix if present. This handling is incomplete and may be wrong under some circumstances.

      ISO 19139:2007 defines the management of multilingual metadata on pages 105 and 106. A character string is localized as in the example below:

      <PT_FreeText>
        <textGroup>
          <LocalisedCharacterString locale="#locale-fr">Résumé succinct du contenu de la ressource</LocalisedCharacterString>
        </textGroup>
      </PT_FreeText>
      

      The locale="#locale-fr" attribute references a locale definition provided elsewhere, typically as an element of the root metadata:

      <MD_Metadata>
        <locale>
          <PT_Locale id="locale-fr">
            <languageCode>
              <LanguageCode codeList="resources/Codelist/gmxcodelists.xml#LanguageCode" codeListValue="fra"> French </LanguageCode>
            </languageCode>
          </PT_Locale>
        </locale>
      </MD_Metadata>
      

      Since the locale attribute is a URI referencing another element, its value typically begins with the # character, while the id attribute in <PT_Locale> does not. However, nothing in the specification says that the locale ID shall be prefixed by "locale-", nor that the text after that prefix shall be the language code. It just happens to be the convention followed in the examples given by the ISO specification.

      A search on the Internet shows that this attribute is used in various ways:
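
      As an illustration of resolving such a reference without relying on the "locale-" convention, here is a minimal sketch. The class name LocaleReferenceResolver and the idea of a map from <PT_Locale> id to java.util.Locale are assumptions made for this example; they are not part of Apache SIS.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Apache SIS. */
final class LocaleReferenceResolver {
    private LocaleReferenceResolver() {
    }

    /**
     * Resolves a same-document reference such as "#locale-fr" against the
     * PT_Locale definitions collected from the document, keyed by their id.
     * No assumption is made about the structure of the id itself.
     */
    static Optional<Locale> resolve(String localeAttribute, Map<String, Locale> definitionsById) {
        if (localeAttribute == null || !localeAttribute.startsWith("#")) {
            return Optional.empty();                  // Not a same-document fragment reference.
        }
        String id = localeAttribute.substring(1);     // The id attribute carries no leading '#'.
        return Optional.ofNullable(definitionsById.get(id));
    }
}
```

      With this approach, an id such as "anything" would resolve just as well as "locale-fr", provided a matching <PT_Locale> definition was found in the document.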

      French mapping agency (IGN)

      Extract from ML_gmxCrs.xml:

      <gmd:PT_FreeText>
        <gmd:textGroup>
          <gmd:LocalisedCharacterString locale="#xpointer(//*[@id='fra'])">Catalogue des paramètres géodésiques pour la description de jeux de métadonnées conformes aux schémas gmx</gmd:LocalisedCharacterString>
        </gmd:textGroup>
      </gmd:PT_FreeText>
      
      <locale>
        <gmd:PT_Locale id="fra">
          <gmd:languageCode>
            <gmd:LanguageCode codeList="../codelist/ML_gmxCodelists.xml#LanguageCode" codeListValue="french">French</gmd:LanguageCode>
          </gmd:languageCode>
        </gmd:PT_Locale>
      </locale>
      

      Observations:

      • The locale attribute value is given as an XPointer expression wrapping an XPath.
      • The codeListValue attribute value in LanguageCode is "french" instead of an ISO language code.
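
      A minimal sketch of how the referenced id could be extracted from this particular form of locale value is given below. The class XPointerIdExtractor is hypothetical, and the pattern only covers the exact shape seen in the IGN extract (an xpointer() selecting any element by a single-quoted id); it is not a general XPointer evaluator.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Apache SIS. */
final class XPointerIdExtractor {
    /** Matches only the form #xpointer(//*[@id='some-id']). */
    private static final Pattern BY_ID =
            Pattern.compile("^#xpointer\\(//\\*\\[@id='([^']+)'\\]\\)$");

    private XPointerIdExtractor() {
    }

    /** Returns the id selected by the expression, or empty if the form is not recognized. */
    static Optional<String> extractId(String localeAttribute) {
        if (localeAttribute == null) {
            return Optional.empty();
        }
        Matcher m = BY_ID.matcher(localeAttribute);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

      The extracted id ("fra" in the extract above) would then still need to be matched against a <PT_Locale> definition, and the non-ISO codeListValue would need separate handling.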

      NOAA

      Extract from Cruise2ISO on geo-ide:

      <gmd:LocalisedCharacterString id="PERSON_NAME_ID" locale="http://www.rvdata.us/person#6708">PERSON/NAME</gmd:LocalisedCharacterString>
      

      Observations:

      • The locale attribute is a URL to a remote resource. However, in this particular case, attempts to fetch that resource return a 404 error. Consequently, there is no obvious way to find the locale for that example.

      INSPIRE

      Extract from Google code:

      <gmd:LocalisedCharacterString locale="en-GB">House</gmd:LocalisedCharacterString>
      

      Observations:

      • The locale attribute directly contains a parseable ISO language code. The absence of a leading # suggests that there is no need to search for a definition elsewhere. This is the easiest case and is supported by current Apache SIS.
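
      For this direct case, a minimal sketch using only the standard java.util.Locale API could look as follows. The class DirectLocaleParser is hypothetical and does not reflect how Apache SIS actually parses the value; it only shows that a value without a leading # can be tried as a language tag.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration only; not part of Apache SIS. */
final class DirectLocaleParser {
    private DirectLocaleParser() {
    }

    /**
     * Parses values such as "en-GB" that hold the language tag directly.
     * Values starting with '#' are references to a definition elsewhere
     * and are deliberately left unresolved here.
     */
    static Optional<Locale> parse(String localeAttribute) {
        if (localeAttribute == null || localeAttribute.isEmpty() || localeAttribute.startsWith("#")) {
            return Optional.empty();
        }
        try {
            // Locale.Builder rejects ill-formed tags instead of silently ignoring them.
            return Optional.of(new Locale.Builder().setLanguageTag(localeAttribute).build());
        } catch (IllformedLocaleException e) {
            return Optional.empty();
        }
    }
}
```

      For the INSPIRE extract, DirectLocaleParser.parse("en-GB") yields a locale with language "en" and country "GB".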

      Other

      Extract from a mailing list:

      <gmd:LocalisedCharacterString locale="#frFR">Montréal</gmd:LocalisedCharacterString>
      

      Observations:

      • This is not really a parseable ISO code because of the missing - character. The # prefix suggests that a definition is provided elsewhere, but the extract from the mail archive does not show it.

      Work needed in SIS

      The fact that the <PT_Locale> elements providing locale definitions may appear after the localized string complicates the handling. One possible approach would be to create an internal object that keeps a reference to a DefaultInternationalString, a String and a locale ID, then invoke the DefaultInternationalString.add(Locale, String) method at some later time, when the Locale becomes known.
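
      The deferred approach described above could be sketched as follows. The class DeferredLocalizedStrings and its methods are hypothetical. To keep the example free of dependencies, the target is modelled as a BiConsumer<Locale, String>, which is an assumption standing in for a method reference to DefaultInternationalString.add(Locale, String).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical sketch for illustration only; not Apache SIS code. */
final class DeferredLocalizedStrings {
    /** One localized string whose locale is not yet known. */
    private static final class Pending {
        final BiConsumer<Locale, String> target;   // Stand-in for DefaultInternationalString::add.
        final String text;
        final String localeId;

        Pending(BiConsumer<Locale, String> target, String text, String localeId) {
            this.target = target;
            this.text = text;
            this.localeId = localeId;
        }
    }

    private final List<Pending> pending = new ArrayList<>();

    /** Remembers a string until the PT_Locale definition with the given id is seen. */
    void defer(BiConsumer<Locale, String> target, String text, String localeId) {
        pending.add(new Pending(target, text, localeId));
    }

    /**
     * Adds every pending string whose locale id is now defined to its target.
     * Returns the number of entries that are still unresolved.
     */
    int resolve(Map<String, Locale> definitionsById) {
        for (Iterator<Pending> it = pending.iterator(); it.hasNext();) {
            Pending p = it.next();
            Locale locale = definitionsById.get(p.localeId);
            if (locale != null) {
                p.target.accept(locale, p.text);
                it.remove();
            }
        }
        return pending.size();
    }
}
```

      In such a design, resolve(...) would be invoked once the whole document has been unmarshalled; entries still pending at that point correspond to references with no definition, as in the NOAA and "#frFR" examples above.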

              People

              • Assignee:
                desruisseaux Martin Desruisseaux
                Reporter:
                desruisseaux Martin Desruisseaux