Derby / DERBY-244

On Linux, depending on the $LANG environment setting and the console encoding, some i18n tests fail

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 10.1.1.0
    • Fix Version/s: 10.2.1.6
    • Component/s: Test
    • Labels:
      None
    • Environment:
      Linux, with console.encoding *not* UTF-8

      Description

      The tests
      i18n/messageLocale.sql
      i18n/urlLocale.sql
      i18n/iepnegativetests_ES.sql

      will fail on Linux if $LANG and, as a result, console.encoding are not set the same way as when the test master was created. The behavior is that some characters are not seen as outside the ANSI range and are displayed as a ?.
      The result matches the master when $LANG is en_US.UTF-8.

      But then ieptest.sql fails with ibm142, although it passes if $LANG is en_US.

      This needs some further analysis, so this description may need to be updated later.
      Whatever the solution is, it will need to work in all situations.

      1. 244.diff
        6 kB
        Knut Anders Hatlen
      2. 244.stat
        0.2 kB
        Knut Anders Hatlen
      3. 244-2.diff
        7 kB
        Knut Anders Hatlen
      4. 244-2.stat
        0.2 kB
        Knut Anders Hatlen
      5. 244-3.diff
        8 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Myrna van Lunteren created issue -
          Myrna van Lunteren made changes -
          Field Original Value New Value
          Type Test [ 6 ] Bug [ 1 ]
          Knut Anders Hatlen made changes -
          Link This issue is related to DERBY-323 [ DERBY-323 ]
          Knut Anders Hatlen added a comment -

          I am stealing this bug from Myrna.

          Knut Anders Hatlen made changes -
          Assignee Myrna van Lunteren [ myrna ] Knut Anders Hatlen [ knutanders ]
          Knut Anders Hatlen made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Knut Anders Hatlen added a comment -

          Attaching a patch which makes

          • the i18n tests run with -Dfile.encoding=UTF-8
          • the streams in ProcessStreamResult use UTF-8 encoding for the i18n tests
          • Sed.java read result files from i18n tests using UTF-8 encoding

          Derbyall runs cleanly with the patch (en_US locale). The i18nTest suite runs cleanly with non-UTF locales. The patch is ready for review. Thanks!
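
          The idea of decoding a test's output with an explicit charset, rather than whatever the platform default happens to be, can be sketched as follows. This is only an illustration: the class and method names are hypothetical and are not taken from ProcessStreamResult or Sed.java, and a byte array stands in for the output stream of a real test process.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the harness code.
final class ExplicitEncodingReader {

    // Reads every line from the stream, decoding bytes with the given charset
    // instead of the platform default (which follows $LANG / file.encoding).
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the output of a test process started with -Dfile.encoding=UTF-8.
        byte[] utf8Bytes = "base de donn\u00e9es\n".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        System.out.print(text);
    }
}
```

          Because the charset is passed in explicitly, the decoded text does not change when $LANG changes on the machine running the harness.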

          Knut Anders Hatlen made changes -
          Attachment 244.stat [ 12336998 ]
          Attachment 244.diff [ 12336997 ]
          Knut Anders Hatlen made changes -
          Derby Info [Patch Available]
          Knut Anders Hatlen added a comment -

          I'm removing the "patch available" flag since the patch does not seem to work with IBM's JVM. I think the console.encoding has to be set as well.

          Knut Anders Hatlen made changes -
          Derby Info [Patch Available]
          Myrna van Lunteren added a comment -

          Thx Knut, for testing with IBM jvm...

          I have some tidbits of half-researched information that may/may not be related, this is just a memory dump:

          • I noticed that when you're on Unix and $LANG is set to en_US.UTF-8, there is a difference in what file.encoding gets set to. With Sun's JVM, file.encoding remains ISO-8859-1, but IBM's JVMs pick up UTF-8. (I run a trusty little Java program that just prints out all system properties.)
          • As a result, I've been running our tests with ibm jvms with $LANG set to en_US. (which is problematic on zOS, where $LANG appears to indicate a programming language, and en_US is just not valid for $LANG.)
          • I think there actually may be an odd kind of bug in ij when the file.encoding is UTF-8. I want to research that further.
          • Note also that RunTest does try to do something with the console encoding in certain cases.
          • I have been thinking that maybe if we can figure out why setting the encoding only really works with sun jvm 1.5, we could maybe use derbyTesting.encoding for these tests. Andrew had logged a bug 1027, but the actual problem may not be with the test harness, see Deepa's (Feb 7 2006) comments in re DERBY-683.
          • part of all this is also to include the i18n/LocalizedDisplay.sql and LocalizedConnectionAttributes.sql tests into a suite at some point.
          • I tried to make the test harness hide some of the problems by wrapping the non-ascii characters in >Enc ### < lines. But I think that never worked properly.
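
          A small program of the kind mentioned above, for comparing what different JVMs pick up from $LANG, might look like the sketch below. It is not the program referred to in the comment. The class name is made up, and console.encoding is read only because this thread discusses it; a JVM that does not define it will simply report null.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: prints the encoding-related settings a JVM actually sees.
final class ShowEncodingProperties {

    // Collects the properties discussed in this issue; a value is null if unset.
    static Map<String, String> encodingProperties() {
        Map<String, String> result = new LinkedHashMap<>();
        String[] keys = {"file.encoding", "console.encoding", "user.language", "user.country"};
        for (String key : keys) {
            result.put(key, System.getProperty(key));
        }
        // The charset the JVM uses when no encoding is given explicitly.
        result.put("defaultCharset", Charset.defaultCharset().name());
        return result;
    }

    public static void main(String[] args) {
        System.out.println("LANG=" + System.getenv("LANG"));
        for (Map.Entry<String, String> e : encodingProperties().entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

          Running it once per JVM and per $LANG value gives a quick table of which combinations disagree.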
          Knut Anders Hatlen added a comment -

          Thanks for your comments, Myrna. The issue was that the test harness sets console.encoding to Cp1252. I think Sun's jvm ignores this property, but IBM's jvm used it instead of file.encoding. Since I had changed ProcessStreamResult to expect UTF-8 input when the tests were run with file.encoding=UTF-8, it crashed (silently) with a MalformedInputException.
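
          The failure mode described here, bytes written in a single-byte encoding being decoded as UTF-8, can be reproduced in isolation. The sketch below is an assumption-laden illustration, not harness code: it uses ISO-8859-1 as a stand-in for Cp1252 (the two agree for the accented letters used) and a strict CharsetDecoder, because readers in current JDKs generally substitute a replacement character instead of throwing.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Illustrative only: strict UTF-8 decoding of bytes that are not UTF-8.
final class StrictDecodeDemo {

    // Decodes the bytes as UTF-8 and reports malformed input instead of replacing it.
    static String decodeStrictUtf8(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // French "Arret" with e-circumflex (U+00EA), written as a single-byte encoding.
        byte[] latin1Bytes = "Arr\u00eat".getBytes(StandardCharsets.ISO_8859_1);
        try {
            System.out.println(decodeStrictUtf8(latin1Bytes));
        } catch (CharacterCodingException e) {
            // The byte 0xEA starts a multi-byte UTF-8 sequence that the next byte does not continue.
            System.out.println("Decoding failed: " + e);
        }
    }
}
```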

          Knut Anders Hatlen added a comment -

          Attaching new patch (244-2.diff). The only change from the previous patch is that it uses -Dconsole.encoding=UTF-8 instead of -Dconsole.encoding=Cp1252 for the i18n tests.

          Derbyall runs cleanly with the patch on Sun JVM 1.5.0. There were a couple of failures on IBM JVM 1.5.0 (ran with LC_ALL and LANG set to ja_JP.eucjp), but they were already logged in JIRA. I have also run the i18nTest suite successfully on IBM JVM 1.4.2.

          Reviews would be welcome! Thanks.

          Knut Anders Hatlen made changes -
          Attachment 244-2.stat [ 12337618 ]
          Attachment 244-2.diff [ 12337617 ]
          Knut Anders Hatlen made changes -
          Derby Info [Patch Available]
          Myrna van Lunteren added a comment -

          I think the patch is an improvement. I did not run derbyall, but I ran i18nTest on Linux & Windows with ibm and sun jvms, and now get the same behavior with the iepnegativetests_ES, urlLocale and messageLocale.sql tests.

          The one remark I have is that I still cannot get the LocalizedDisplay.sql and LocalizedConnectionAttribute.sql test from the i18n directory to behave the same under windows and Linux (with sun jdk 1.4.2.).
          For windows, I had to update the masters for these tests, but running them on Linux still failed for me.
          With jdk131, ibm131 and ibm142 the LocalizedDisplay.sql test hung, and LocalizedConnectionAttribute exits with a MalformedInputException.
          It would be nice if we could figure out a way to add these tests to the suites...

          — stack of LocalizedConnectionAttribute on Linux —
          Exception in thread "main" sun.io.MalformedInputException
          at sun.io.ByteToCharUTF8.convert(ByteToCharUTF8.java(Compiled Code))
          at sun.nio.cs.StreamDecoder$ConverterSD.convertInto(StreamDecoder.java:287)
          at sun.nio.cs.StreamDecoder$ConverterSD.implRead(StreamDecoder.java:337)
          at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:223)
          at java.io.InputStreamReader.read(InputStreamReader.java:208)
          at java.io.BufferedReader.fill(BufferedReader.java:153)
          at java.io.BufferedReader.readLine(BufferedReader.java:316)
          at java.io.BufferedReader.readLine(BufferedReader.java:379)
          at org.apache.derbyTesting.functionTests.harness.RunTest.setDirectories(RunTest.java:729)
          at org.apache.derbyTesting.functionTests.harness.RunTest.main(RunTest.java:262)
          ----------------------------------------------------------------------------

          Rick Hillegas added a comment -

          Hi Myrna,

          Are you recommending that we commit this patch but that additional work be put into analyzing LocalizedDisplay.sql and LocalizedConnectionAttribute.sql?

          Myrna van Lunteren added a comment -

          yes, that's exactly what I am proposing; unless someone has other ideas...

          Rick Hillegas added a comment -

          I'm afraid I see diffs in these tests when run against jar files on Linux and jdk1.4:

          urlLocale
          messageLocale
          iepnegativetests_ES

          Rick Hillegas added a comment -

          Hi Myrna,

          On my linux machine $LANG is "en_US.ISO-8859-1" and I see the following diffs in the urlLocale test after applying this patch:

          21 del
          < ERROR 08006: Arr EnC:>234< t de la base de donn EnC:>233< es 'swissdb'.
          21a21
          > ERROR 08006: Arr EnC:>65533< t de la base de donn EnC:>65533< es 'swissdb'.
          28 del
          < ERROR 08006: Arr EnC:>234< t de la base de donn EnC:>233< es 'swissdb'.
          28a28
          > ERROR 08006: Arr EnC:>65533< t de la base de donn EnC:>65533< es 'swissdb'.
          Test Failed.

          *** End: urlLocale jdk1.4.2_08 2006-08-04 08:09:58 ***
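
          The numbers in this diff are character values: 234 and 233 are U+00EA and U+00E9 (e with circumflex and e with acute), while 65533 is U+FFFD, the Unicode replacement character. One way a 65533 can appear, shown below purely as an illustration, is when bytes written in a single-byte encoding are read back as UTF-8. Whether that is what happened in this particular run is not established in the thread, and the class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: how a charset mismatch turns 234 into 65533.
final class ReplacementCharDemo {

    // Encodes text with one charset, decodes the bytes with another,
    // and returns the numeric value of each resulting char.
    static int[] charValues(String text, Charset writtenAs, Charset readAs) {
        String decoded = new String(text.getBytes(writtenAs), readAs);
        int[] values = new int[decoded.length()];
        for (int i = 0; i < decoded.length(); i++) {
            values[i] = decoded.charAt(i);
        }
        return values;
    }

    public static void main(String[] args) {
        String word = "Arr\u00eat"; // French "Arret" with e-circumflex, U+00EA (decimal 234)

        int[] matching = charValues(word, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        int[] mismatched = charValues(word, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("same charset on both sides: " + matching[3]);
        System.out.println("written ISO-8859-1, read UTF-8: " + mismatched[3]);
    }
}
```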
          Knut Anders Hatlen added a comment -

          Thanks for the reviews, Myrna and Rick!

          I also see that the tests fail with Sun JVM 1.4.2 under Linux when
          LANG=en_US.ISO-8859-1. I don't know why, but it seems to work when
          derby.ui.codeset is UTF-8.

          I am uploading a new patch (244-3.diff) which has these extra changes
          in RunTest.java:

          • adds a new variable, codeset, which holds the value of
            derby.ui.codeset if it is specified by the test in
            XXX_app.properties
          • sets derby.ui.codeset to UTF-8 in i18n tests for which no codeset
            has been specified
          • uses the encoding specified by derby.ui.codeset to read the output
            from the test (but it is still written as UTF-8 to the tmp file)

          This seems to fix all the tests in the i18nTest suite for all (LANG,
          VENDOR, VERSION) in (en_US.ISO-8859-1, en_US.UTF-8) x (Sun, IBM) x
          (1.4.2, 1.5.0) under Linux.
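
          The codeset rule described in the list above can be sketched roughly as follows. This is not the code from 244-3.diff: the class, the method and the isI18nTest flag are hypothetical names chosen for illustration, and only the property name derby.ui.codeset and the UTF-8 default come from this comment.

```java
import java.util.Properties;

// Hypothetical sketch of the rule; not the RunTest.java change itself.
final class CodesetDefault {

    // Returns the codeset named in the test's properties, falls back to UTF-8
    // for i18n tests, and returns null (platform default) for everything else.
    static String resolveCodeset(Properties appProperties, boolean isI18nTest) {
        String codeset = appProperties.getProperty("derby.ui.codeset");
        if (codeset == null && isI18nTest) {
            codeset = "UTF-8";
        }
        return codeset;
    }

    public static void main(String[] args) {
        Properties withCodeset = new Properties();
        withCodeset.setProperty("derby.ui.codeset", "EUC_JP");

        System.out.println(resolveCodeset(withCodeset, true));        // explicit value wins
        System.out.println(resolveCodeset(new Properties(), true));   // i18n default
        System.out.println(resolveCodeset(new Properties(), false));  // no codeset chosen
    }
}
```

          The resolved value would then be the encoding used to read the test's output, so the reader and the test agree regardless of $LANG.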

          It is also possible that this patch makes LocalizedDisplay.sql have
          the same behaviour on Windows and Linux, but I don't have any machine
          running Windows to test it on. (I never saw the hang, by the way.)

          LocalizedConnectionAttribute.sql fails because RunTest expects all sql
          files to be UTF-8 encoded, while LocalizedConnectionAttribute.sql is
          Cp850 encoded. I think this can be solved by using
          InputStream/OutputStream instead of Reader/Writer to copy the sql file
          from derbyTesting.jar to the test directory, but I won't add that to
          this patch as it is messy enough as it is.
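
          The difference between the two copying approaches can be shown with a short sketch. It is an illustration only, with hypothetical names and in-memory streams standing in for the jar entry and the test directory. The byte 0x82 is e with acute in Cp850 and is not valid on its own in UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: byte-for-byte copy versus a copy that decodes and re-encodes.
final class ByteCopyDemo {

    // Copies raw bytes without interpreting them, so the file's encoding is irrelevant.
    static void copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    static byte[] copyViaStreams(byte[] source) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyBytes(new ByteArrayInputStream(source), out);
        return out.toByteArray();
    }

    // Mimics a Reader/Writer copy that assumes the content is UTF-8.
    static byte[] copyViaUtf8Text(byte[] source) {
        return new String(source, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] cp850 = {'c', 'a', 'f', (byte) 0x82}; // "cafe" with e-acute, as Cp850 bytes
        System.out.println("stream copy preserved bytes: "
                + Arrays.equals(cp850, copyViaStreams(cp850)));
        System.out.println("UTF-8 text copy preserved bytes: "
                + Arrays.equals(cp850, copyViaUtf8Text(cp850)));
    }
}
```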

          Knut Anders Hatlen made changes -
          Attachment 244-3.diff [ 12338828 ]
          Knut Anders Hatlen added a comment -

          Committed revision 432645. The problems with LocalizedDisplay and LocalizedConnectionAttribute have been logged as DERBY-1726, so I'm marking this issue as resolved.

          Knut Anders Hatlen made changes -
          Fix Version/s 10.2.1.0 [ 11187 ]
          Status In Progress [ 3 ] Resolved [ 5 ]
          Derby Info [Patch Available]
          Resolution Fixed [ 1 ]
          Myrna van Lunteren made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Kathey Marsden made changes -
          Rank Ranked higher
          Gavin made changes -
          Workflow jira [ 42107 ] Default workflow, editable Closed status [ 12802560 ]
          Transition               Time In Source Status   Execution Times   Last Executer        Last Execution Date
          Open -> In Progress      443d 7h 32m             1                 Knut Anders Hatlen   16/Jul/06 12:21
          In Progress -> Resolved  33d 6h 10m              1                 Knut Anders Hatlen   18/Aug/06 18:32
          Resolved -> Closed       214d 3h 37m             1                 Myrna van Lunteren   20/Mar/07 21:09

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Myrna van Lunteren