DERBY-4508

ij on slavic machine does not create files with appropriate encoding

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 10.5.3.0
    • Fix Version/s: None
    • Component/s: Tools
    • Labels:
      None
    • Environment:
      PC (Windows Server 2007) with code page Cp852 (console.encoding is Cp852)
    • Issue & fix info:
      Repro attached, Workaround attached

      Description

      On a machine configured with the Slavic code page Cp852, ij does not always create (or access) files whose names contain the expected characters.

      For instance, consider the following string with non-ASCII characters: Českýnázev

      When starting ij using only defaults, with java org.apache.derby.tools.ij (or one of the ij scripts from bin), and issuing the following connect statement:
      ij> connect 'jdbc:derby:Českýnázev;create=true';
      ij creates a database directory whose name appears on the OS as: ¬eskěn zev

      When the same connect statement is run from a file, e.g. simple.sql, the database directory created on the OS has the expected name (Českýnázev).

      The following simple program reads its input the same way as org.apache.derby.iapi.tools.i18n.LocalizedInput:
      -----------------------
      import java.io.BufferedReader;
      import java.io.File;
      import java.io.InputStreamReader;

      public class GarbledFilename {
          public static void main(String[] args) throws Exception {
              // Read one line from stdin using the JVM's default charset.
              InputStreamReader isr = new InputStreamReader(System.in);
              BufferedReader in = new BufferedReader(isr);
              String inputString = in.readLine();
              System.out.println("inputString: " + inputString);
              // Create a file named after the line that was read.
              File f = new File(inputString);
              f.createNewFile();
              System.out.println("created a file called " + inputString);
              in.close();
              isr.close();
          }
      }
      ----------------------------------------
      Sun's JDK 1.6 gives the following output:
      ------------------
      Českýnázev
      inputString: Českýnázev
      created a file called Českýnázev
      ------------------
      IBM's JDK 1.6 gives this instead:
      ------------------
      Českýnázev
      inputString: ¬eskěn zev
      created a file called ¬eskěn zev
      ------------------
      However, in both cases the file created on the OS (as seen from the DOS prompt and Windows Explorer) has the same garbled name as the database directory that ij created:
      ¬eskěn zev
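
      The garbled name is consistent with Cp852-encoded bytes being decoded with a different single-byte code page. The sketch below reproduces that effect in isolation. It assumes the second code page is Cp1250 (the usual Windows ANSI code page for Central European locales); the report itself does not name it, and the class and method names are illustrative only.

```java
import java.nio.charset.Charset;

class EncodingMismatchDemo {
    // Encode text with one charset, then decode the same bytes with another.
    static String garble(String text, String writtenAs, String readAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(readAs));
    }

    public static void main(String[] args) {
        // The name from the report, written with Unicode escapes so that
        // the encoding of this source file does not matter.
        String name = "\u010Cesk\u00FDn\u00E1zev";
        // Assumption: Cp1250 is the code page that ends up decoding the bytes.
        System.out.println(garble(name, "Cp852", "Cp1250"));
    }
}
```

      Under that assumption the three non-ASCII characters map to ¬, ě and a no-break space (U+00A0), which matches the name shown above.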

      If we specify -Dfile.encoding=Cp852, or -Dderby.ui.codeset=Cp852 when starting ij, the file created has the expected name, so this is a workaround.
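
      The workaround amounts to telling the JVM which code page the console bytes are in. The same idea can be sketched in the repro program by passing the charset to InputStreamReader explicitly instead of relying on the default. This is only an illustration, assuming the console delivers Cp852 bytes; it is not how ij itself is implemented, and the class and method names are made up for the example.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class ExplicitConsoleCharset {
    // Read one line from the stream, decoding its bytes with the named
    // charset rather than the JVM default.
    static String readLine(InputStream raw, String charsetName) {
        try {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(raw, Charset.forName(charsetName)));
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the console code page is Cp852 unless another
        // charset name is given as the first argument.
        String consoleCharset = args.length > 0 ? args[0] : "Cp852";
        String inputString = readLine(System.in, consoleCharset);
        System.out.println("inputString: " + inputString);
        if (inputString != null) {
            new File(inputString).createNewFile();
        }
    }
}
```

      Whether the file name then reaches the file system intact can still depend on the JVM and the OS, as the difference between the Sun and IBM runs suggests.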

      1. GarbledFilename.java
        0.7 kB
        Myrna van Lunteren

        Activity

        Myrna van Lunteren added a comment -

        After some more experiments and further checking, it seems that the only information available to the JVM about the code page (chcp) of the OS is the IBM-JVM-specific console.encoding property.

        I'm changing this to an enhancement request - perhaps we can modify the ij scripts so that -Dderby.ui.codeset is set by default in accordance with the code page of the OS.

        Myrna van Lunteren added a comment -

        attaching the simple repro

        Myrna van Lunteren added a comment -

        I believe the user who ran into this has implemented a workaround (using a script other than ij and passing in the required encoding values).
        One problem I noted in my tests is that the behavior differed between IBM JVMs and (at the time) Sun JVMs: with the IBM JVMs it was necessary to set -Dfile.encoding. There may therefore be a JVM incompatibility issue here too, which would complicate any solution that is meant to always work.
        I'm now closing this as won't fix.


          People

          • Assignee:
            Unassigned
            Reporter:
            Myrna van Lunteren