Uploaded image for project: 'Directory Client API'
  1. Directory Client API
  2. DIRAPI-219

DateUtils.toGeneralizedTime does not work with some Locales

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0-M28
    • Fix Version/s: 1.0.0-M29, 1.0.0-M32
    • Labels:
      None

      Description

      Over in SOLR-6915 I've run into an issue with a few Locales when trying to use Apache Directory Server via the Hadoop MiniKDC. Here's an example failure that happens on JDK8: https://issues.apache.org/jira/browse/SOLR-6915?focusedCommentId=14287516&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14287516

      The locales that have a problem for me are:
      th_TH_TH_#u-nu-thai
      ja_JP_JP_#u-ca-japanese
      hi_IN

      I've tracked these problems to DateUtils.toGeneralizedTime(), which is returning the following to me for these 3 locales, respectively:
      ๒๕๕๘๐๑๒๗๐๑๐๘๐๖.๙๒๙Z
      270127010806.259Z
      २०१५०१२७०१०८०६.०४०Z

        Activity

        Hide
        thetaphi Uwe Schindler added a comment -

        OK, thanks! Sorry for the late feedback. I just checked your changes yesterday because we got another Locale problem yesterday (on Java 9 previews). I will keep track on updating to latest version (once this is in) in Apache Solr.

        There are still some similar problems with BouncyCastle (I think), which is also using default locale sometimes, see SOLR-6915.

        Show
        thetaphi Uwe Schindler added a comment - OK, thanks! Sorry for the late feedback. I just checked your changes yesterday because we got another Locale problem yesterday (on Java 9 previews). I will keep track on updating to latest version (once this is in) in Apache Solr. There are still some similar problems with BouncyCastle (I think), which is also using default locale sometimes, see SOLR-6915 .
        Hide
        elecharny Emmanuel Lecharny added a comment -

        I have used calendar = new GregorianCalendar(GMT, Locale.ROOT)

        Regarding the other pb (mainly with Strings), we are a bit covered, as we only manipulate UTF-8 Strings, per teh LDAP RFC requirement. Wrll, I hope we are safe

        http://svn.apache.org/r1697165

        Show
        elecharny Emmanuel Lecharny added a comment - I have used calendar = new GregorianCalendar(GMT, Locale.ROOT) Regarding the other pb (mainly with Strings), we are a bit covered, as we only manipulate UTF-8 Strings, per teh LDAP RFC requirement. Wrll, I hope we are safe http://svn.apache.org/r1697165
        Hide
        thetaphi Uwe Schindler added a comment - - edited

        Hi thanks for the quick change! The above commit was not the one changing the calendar, so I was a bit confused (it was the previous one): http://svn.apache.org/r1697148

        This should now be fine, but a bit confusing, so I suggest to change this a bit:

        • GregorianCalendar.getInstance does not exist, its a static method in Calendar (Eclipse or PMD should print a warning about calling a static method on subclass). So either change to Calendar.getInstance(Locale.ROOT) or Calendar.getInstance(Locale.ENGLISH); or much better:
        • use the constructor: calendar = new GregorianCalendar(GMT, Locale.ROOT). By that you also don't need to use the setters and its explicit which calendar you want to use (see documentation and BNF requiring GregorianCalendar)! Relying on the Locale is not correct, because suddenly ENGLISH may change to another Calendar (this would be extreme - for sure, but we dont want to rely on country or language here).

        Thanks for investigating forbidden-apis, we implemented that because of such problems in Apache Lucene and Solr whoich are very language agnostic, so it is very important to be explicit! If you have any question we can help. It is also useful to "forbid" your own definitions. E.g. our test framework does not allow threads/threadpools without names and this is also enforced by the tool.

        Show
        thetaphi Uwe Schindler added a comment - - edited Hi thanks for the quick change! The above commit was not the one changing the calendar, so I was a bit confused (it was the previous one): http://svn.apache.org/r1697148 This should now be fine, but a bit confusing, so I suggest to change this a bit: GregorianCalendar.getInstance does not exist, its a static method in Calendar (Eclipse or PMD should print a warning about calling a static method on subclass). So either change to Calendar.getInstance(Locale.ROOT) or Calendar.getInstance(Locale.ENGLISH) ; or much better: use the constructor: calendar = new GregorianCalendar(GMT, Locale.ROOT) . By that you also don't need to use the setters and its explicit which calendar you want to use (see documentation and BNF requiring GregorianCalendar)! Relying on the Locale is not correct, because suddenly ENGLISH may change to another Calendar (this would be extreme - for sure, but we dont want to rely on country or language here). Thanks for investigating forbidden-apis, we implemented that because of such problems in Apache Lucene and Solr whoich are very language agnostic, so it is very important to be explicit! If you have any question we can help. It is also useful to "forbid" your own definitions. E.g. our test framework does not allow threads/threadpools without names and this is also enforced by the tool.
        Hide
        elecharny Emmanuel Lecharny added a comment -

        I hope it's fixed with the latest patch.

        Uwe, please feel free to reopen the issue if what I committed is not enough. I'm not a Calendar god...

        Show
        elecharny Emmanuel Lecharny added a comment - I hope it's fixed with the latest patch. Uwe, please feel free to reopen the issue if what I committed is not enough. I'm not a Calendar god...
        Hide
        elecharny Emmanuel Lecharny added a comment -
        Show
        elecharny Emmanuel Lecharny added a comment - Patch applied : http://svn.apache.org/r1697149
        Hide
        elecharny Emmanuel Lecharny added a comment -

        Uwe,

        FTR, the Calendar.createCalendar class returns a GregorianCalendar when the Locale is set to Locale.ENGLISH.

        Anyway, I'll change the code accordingly. Btw, interesting reading (Policeman-tools), and I'm pretty sue we will add this into our build.

        Show
        elecharny Emmanuel Lecharny added a comment - Uwe, FTR, the Calendar.createCalendar class returns a GregorianCalendar when the Locale is set to Locale.ENGLISH. Anyway, I'll change the code accordingly. Btw, interesting reading (Policeman-tools), and I'm pretty sue we will add this into our build.
        Hide
        thetaphi Uwe Schindler added a comment - - edited

        What does the Locale has to do how strings are encoded? The code changes nothing it was fine before and after.

        The problem is that your calendar is wrong. The above locales simply don't use gregorian calendars.

        My suggestion: Use SimpleDateFormat and pass Locale = Locale.ENGLISH, also use a GregorianCalendar.

        In addition, when parsing date times in this format, you also have to give the correct Locale. E.g., Thai locales don't use ASCII digits, so a date with ASCII digits cannot be parsed by those parsers, unless you initialize the date parser with ENGLISH or ROOT locale.

        I would also suggest to use the forbidden-apis Maven/Ant module that checks for all signatures in the JDK that rely on some external (unwanted) defaults like charsets, locales or timezones: https://github.com/policeman-tools/forbidden-apis

        Lucene/Solr/Elasticsearch/Tika use that module. See also the blog post about it why you should use it.

        Show
        thetaphi Uwe Schindler added a comment - - edited What does the Locale has to do how strings are encoded? The code changes nothing it was fine before and after. The problem is that your calendar is wrong. The above locales simply don't use gregorian calendars. My suggestion: Use SimpleDateFormat and pass Locale = Locale.ENGLISH, also use a GregorianCalendar. In addition, when parsing date times in this format, you also have to give the correct Locale. E.g., Thai locales don't use ASCII digits, so a date with ASCII digits cannot be parsed by those parsers, unless you initialize the date parser with ENGLISH or ROOT locale. I would also suggest to use the forbidden-apis Maven/Ant module that checks for all signatures in the JDK that rely on some external (unwanted) defaults like charsets, locales or timezones: https://github.com/policeman-tools/forbidden-apis Lucene/Solr/Elasticsearch/Tika use that module. See also the blog post about it why you should use it.
        Hide
        elecharny Emmanuel Lecharny added a comment -

        I have changed the code so that it always return an UTF-8 Strings. See http://svn.apache.org/r1665445

        Show
        elecharny Emmanuel Lecharny added a comment - I have changed the code so that it always return an UTF-8 Strings. See http://svn.apache.org/r1665445
        Hide
        gchanan Gregory Chanan added a comment -

        Thanks for following up Emmanuel Lecharny.

        As I mentioned in my comment on SOLR-6915, the problem with ja_JP_JP_#u-ca-japanese appears to be coming from bouncycastle, so I may have been mistaken in listing it here.

        The original errors I posted in SOLR-6915 are from M15, but the output I posted above are from M28 and look identical to M15. I don't have an isolated test case. Am I correct in assuming the output for th_TH_TH_#u-nu-thai and २०१५०१२७०१०८०६.०४०Z are buggy because they don't follow the usual format?

        Show
        gchanan Gregory Chanan added a comment - Thanks for following up Emmanuel Lecharny . As I mentioned in my comment on SOLR-6915 , the problem with ja_JP_JP_#u-ca-japanese appears to be coming from bouncycastle, so I may have been mistaken in listing it here. The original errors I posted in SOLR-6915 are from M15, but the output I posted above are from M28 and look identical to M15. I don't have an isolated test case. Am I correct in assuming the output for th_TH_TH_#u-nu-thai and २०१५०१२७०१०८०६.०४०Z are buggy because they don't follow the usual format?
        Hide
        elecharny Emmanuel Lecharny added a comment -

        Do you have an isolated test case for toGeneralizedTime() ?

        Also which version of ApacheDS are you using ?

        Show
        elecharny Emmanuel Lecharny added a comment - Do you have an isolated test case for toGeneralizedTime() ? Also which version of ApacheDS are you using ?

          People

          • Assignee:
            Unassigned
            Reporter:
            gchanan Gregory Chanan
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development