Harmony / HARMONY-1111

[classlib][lang] unexpected IllegalArgumentException for String(byte[], int, int, String)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Classlib
    • Labels:
      None

      Description

      The Harmony constructor java.lang.String(byte[] bytes, int offset, int length, String charsetName) throws IllegalArgumentException for input that the RI accepts; the RI creates the String object without error.

      ==================== test.java =======================
      import java.nio.charset.Charset;

      public class test {
          public static void main(String[] args) throws Exception {
              byte[] b = new byte[256]; // every possible byte value, 0x00-0xFF
              for (int i = 0; i < b.length; i++) {
                  b[i] = (byte) i;
              }
              System.out.println("res = " + new String(b, 170, 30, Charset.forName("UTF-8").name())); // bytes 0xAA-0xC7
          }
      }
      ==================================================

      Output:
      C:\tmp\tmp17>C:\jrockit-jdk1.5.0-windows-ia32\bin\java.exe -cp . -showversion test
      java version "1.5.0"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-b64)
      BEA WebLogic JRockit(R) (build dra-38972-20041208-2001-win-ia32, R25.0.0-75, GC: System optimized over throughput (initial strategy singleparpar))

      res = ??????????????????????????????

      C:\tmp\tmp17>C:\harmony\classlib1.5\deploy\jdk\jre\bin\java.exe -cp . -showversion test
      java version 1.5 (subset)

      (c) Copyright 1991, 2006 The Apache Software Foundation or its licensors, as applicable.
      Exception in thread "main" java.lang.IllegalArgumentException: The length must be positive.
      at java.nio.charset.CoderResult.malformedForLength(CoderResult.java:152)
      at com.ibm.icu4jni.charset.CharsetDecoderICU.implFlush(Unknown Source)
      at java.nio.charset.CharsetDecoder.flush(CharsetDecoder.java:571)
      at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:254)
      at java.nio.charset.Charset.decode(Charset.java:690)
      at java.lang.String.<init>(String.java:352)
      at test.main(test.java:9)

      1. CoderResult.patch
        0.6 kB
        Vladimir Ivanov
      2. String.patch
        0.9 kB
        Vladimir Ivanov
      3. String2Test.patch
        1.0 kB
        Vladimir Ivanov

        Activity

        Vladimir Ivanov added a comment -

        patch

        Mark Hindess added a comment -

        I don't think this is the correct fix. With this fix, the following line of code:

        java.nio.charset.CoderResult.malformedForLength(0)

        no longer throws an IllegalArgumentException even though the RI does throw such an exception.
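
The contract described above can be checked with a small sketch like the following. The class name MalformedForLengthCheck and the helper throwsOnLength are illustrative names only and are not part of Harmony or the RI; the only API taken from the discussion is CoderResult.malformedForLength(int).

```java
import java.nio.charset.CoderResult;

// Illustrative check (hypothetical names): reports whether
// CoderResult.malformedForLength rejects a given length.
final class MalformedForLengthCheck {
    static boolean throwsOnLength(int length) {
        try {
            CoderResult.malformedForLength(length);
            return false; // a CoderResult was created
        } catch (IllegalArgumentException expected) {
            return true;  // the length was rejected
        }
    }

    public static void main(String[] args) {
        System.out.println("length 0 rejected: " + throwsOnLength(0));
        System.out.println("length 1 rejected: " + throwsOnLength(1));
    }
}
```

A fix that makes the first call succeed would change behaviour that callers of CoderResult can observe, which is the objection raised in this comment.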

        Vladimir Ivanov added a comment -

        I'll look into it.

        Vladimir Ivanov added a comment -

        The root cause of this issue is that the RI and Harmony behave differently when decoding bytes in the range 0xC0-0xFF (according to the JVMS, these belong to 2- and 3-byte symbols in UTF-8 strings).
        According to the API spec for this constructor:
        "The behavior of this constructor when the given bytes are not valid in the given charset is unspecified"

        So, for an invalid byte array the RI returns '0x3f', while Harmony returns '0x1a' or throws an exception.
        The proposed patch is invalid. It seems the String constructor should be fixed instead.

        ============ test.java =====================
        import java.nio.charset.Charset;

        public class test {
            public static void main(String[] args) throws Exception {
                byte[] b = {(byte) 0xBF}; // continuation byte with no lead byte
                String str = new String(b, 0, 1, Charset.forName("UTF-8").name());
                b = str.getBytes();
                System.out.print("0x");
                for (int i = 0; i < b.length; i++) {
                    System.out.print(Integer.toHexString(b[i]));
                }
                System.out.println("");
                byte[] b1 = {(byte) 0xC0}; // has the bit pattern of a 2-byte lead byte, but nothing follows
                str = new String(b1, 0, 1, Charset.forName("UTF-8").name());
                b1 = str.getBytes();
                System.out.print("0x");
                for (int i = 0; i < b1.length; i++) {
                    System.out.print(Integer.toHexString(b1[i]));
                }
            }
        }
        ========================================

        Output:
        C:\tmp\tmp17>C:\jrockit-jdk1.5.0-windows-ia32\bin\java.exe -cp . -showversion test
        java version "1.5.0"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-b64)
        BEA WebLogic JRockit(R) (build dra-38972-20041208-2001-win-ia32, R25.0.0-75, GC: System optimized over throughput (initial strategy singleparpar))

        0x3f
        0x3f
        C:\tmp\tmp17>C:\harmony\classlib1.5\deploy\jdk\jre\bin\java.exe -cp . -showversion test
        java version 1.5 (subset)

        (c) Copyright 1991, 2006 The Apache Software Foundation or its licensors, as applicable.
        0x1a
        Exception in thread "main" java.lang.IllegalArgumentException: The length must be positive.
        at java.nio.charset.CoderResult.malformedForLength(CoderResult.java:152)
        at com.ibm.icu4jni.charset.CharsetDecoderICU.implFlush(Unknown Source)
        at java.nio.charset.CharsetDecoder.flush(CharsetDecoder.java:571)
        at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:254)
        at java.nio.charset.Charset.decode(Charset.java:727)
        at java.lang.String.<init>(String.java:361)
        at test.main(test.java:14)
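
The difference between reporting and replacing malformed input can be illustrated directly with CharsetDecoder. This is a sketch of the general mechanism only, not the code path used by either implementation's String constructor; the class name MalformedInputSketch and its decode helper are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, not taken from Harmony or the RI.
final class MalformedInputSketch {
    // Decodes UTF-8 with the given action for bad input. Returns null when
    // the decoder reports an error instead of producing characters.
    static String decode(byte[] bytes, CodingErrorAction action) {
        CharsetDecoder decoder = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] invalid = {(byte) 0xC0};
        String reported = decode(invalid, CodingErrorAction.REPORT);
        String replaced = decode(invalid, CodingErrorAction.REPLACE);
        System.out.println("REPORT  -> " + (reported == null ? "decoding error" : reported));
        System.out.println("REPLACE -> " + replaced.length() + " replacement char(s)");
    }
}
```

With REPLACE the caller always receives a String, which matches what the RI output above suggests for the constructor; with REPORT the malformed byte surfaces as a checked exception rather than an IllegalArgumentException.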

        Vladimir Ivanov added a comment -

        Updated patch.
        The constructor now returns '0x3f' for bytes 0xC0-0xFF. Maybe it should be changed to '0x1a'.

        Output on fixed version:
        C:\tmp\tmp17>C:\jrockit-jdk1.5.0-windows-ia32\bin\java.exe -cp . -showversion test
        java version "1.5.0"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-b64)
        BEA WebLogic JRockit(R) (build dra-38972-20041208-2001-win-ia32, R25.0.0-75, GC: System optimized over throughput (initial strategy singleparpar))

        0x3f
        0x3f
        C:\tmp\tmp17>C:\harmony\classlib1.5\deploy\jdk\jre\bin\java.exe -cp . -showversion test
        java version 1.5 (subset)

        (c) Copyright 1991, 2006 The Apache Software Foundation or its licensors, as applicable.
        0x1a
        0x3f
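
One way to get '0x3f' for malformed bytes through the public charset API is to decode leniently with "?" as the replacement string. The sketch below is an assumption about how such behaviour could be expressed; it is not the content of String.patch, and LenientDecode and its decode method are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not the actual Harmony fix.
final class LenientDecode {
    // Decodes a slice of bytes, substituting '?' (0x3f) for invalid input.
    static String decode(byte[] bytes, int offset, int length, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith("?");
        try {
            return decoder.decode(ByteBuffer.wrap(bytes, offset, length)).toString();
        } catch (CharacterCodingException e) {
            // Not expected when both actions are REPLACE.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String s = decode(new byte[] {(byte) 0xC0}, 0, 1, "UTF-8");
        System.out.println("0x" + Integer.toHexString(s.charAt(0)));
    }
}
```

Choosing '0x1a' instead would only require a different argument to replaceWith, so the open question in this comment is about which substitution value to match, not about the mechanism.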

        Vladimir Ivanov added a comment -

        unit test
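
The contents of String2Test.patch are not reproduced on this page. A regression check for this issue could look roughly like the sketch below, which only asserts that the constructor does not throw IllegalArgumentException for the input from the description. The class and method names are hypothetical, and the sketch uses plain Java rather than the test framework used by the Harmony test suite.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical regression check, not the attached String2Test.patch.
final class StringConstructorRegression {
    // Returns true if new String(bytes, offset, length, "UTF-8") completes
    // without IllegalArgumentException. The decoded content is not checked,
    // because the API spec leaves it unspecified for invalid input.
    static boolean constructs(byte[] bytes, int offset, int length) {
        try {
            String s = new String(bytes, offset, length, "UTF-8");
            return s != null;
        } catch (IllegalArgumentException e) {
            return false;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] b = new byte[256];
        for (int i = 0; i < b.length; i++) {
            b[i] = (byte) i;
        }
        System.out.println("constructed: " + constructs(b, 170, 30));
    }
}
```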

        Mark Hindess added a comment -

        Applied in r470381. Please confirm it has been applied as expected.

        Vladimir Ivanov added a comment -

        verified, thanks

        Mark Hindess added a comment -

        Verified by Vladimir.


          People

          • Assignee:
            Mark Hindess
          • Reporter:
            Vladimir Ivanov
          • Votes:
            0
          • Watchers:
            0
