Directory Studio / DIRSTUDIO-802

Confusion between ISO-8859-1 and UTF-8 in the Encode/Decode dialog

    Details

      Description

      The Encode/Decode tool (in the LDAP menu) gives "aGVydsOp" as the Base64 encoding of the string "hervé" in ISO-8859-1,
      whereas the website http://www.base64decode.org/ gives that same result for the same string in UTF-8.
      Conversely, the UTF-8 Base64 encoding from Apache Directory Studio matches the ISO-8859-1 Base64 encoding from that website.
      The results from my own Java code match the website's.

      I think the Encode/Decode LDAP GUI tool of Apache Directory Studio confuses ISO-8859-1 and UTF-8.
      It should be easy to fix.

      1. encode-decode LDAP demo 2.PNG (10 kB, julien2512)
      2. encode-decode LDAP demo 1.PNG (10 kB, julien2512)

          Activity

          Emmanuel Lecharny added a comment -

          > The encode/decode tool (from the LDAP menu) gives "aGVydsOp" for BASE-64 encoding from the ISO-8859-1 string "hervé".

          That is expected: "hervé" encoded in UTF-8 and then Base64-encoded yields aGVydsOp.

          Every String in LDAP is supposed to be UTF-8.

          The Base64 encoding of "hervé" encoded in ISO-8859-1 would be aGVyduk=.
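
The two values above can also be checked in the decoding direction. The following is only a minimal sketch using the standard JDK `java.util.Base64` class; the class name `Base64DecodeCheck` and its `decode` helper are hypothetical and are not part of Apache Directory Studio. It illustrates that the same Base64 text yields different strings depending on the charset used to interpret the decoded bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of Apache Directory Studio.
class Base64DecodeCheck {

    // Base64-decode the input, then interpret the raw bytes in the given charset.
    static String decode(String base64, Charset charset) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Bytes 0x68 0x65 0x72 0x76 0xC3 0xA9 read as UTF-8: "hervé"
        System.out.println(decode("aGVydsOp", StandardCharsets.UTF_8));
        // Bytes 0x68 0x65 0x72 0x76 0xE9 read as ISO-8859-1: "hervé"
        System.out.println(decode("aGVyduk=", StandardCharsets.ISO_8859_1));
        // Mismatched charset: 0xC3 and 0xA9 are read as two separate Latin-1 characters
        System.out.println(decode("aGVydsOp", StandardCharsets.ISO_8859_1));
    }
}
```

What the console shows for the non-ASCII characters depends on the terminal's own encoding, so comparing the returned strings in code is more reliable than reading the printed output.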

          > while the website http://www.base64decode.org/

          Can't access it.

          Give www.motobit.com/util/base64-decoder-encoder.asp a try.

          > gives the same results from the same string but in UTF-8.
          > Also UTF-8 BASE 64 encoding of Apache Directory Studio match with ISO-8859-1 BASE64 encoding of the previous website.

          No, not on the web site I provided, AFAICT.

          > The result from my own java code match with that website.
          Can you provide your code?

          julien2512 added a comment - edited

          Look at these two files showing the conversion results in my Apache Directory Studio, version 1.5.3.v20100330.

          hervé in ISO-8859-1 => aGVydsOp
          hervé in UTF-8 => aGVyduk=

          What do you think of that?

          I'll post my code later (if necessary); I have to work ;).

          Emmanuel Lecharny added a comment -

          Here is what I get when I use this snippet of code :

          @Test
          public void testUtf8Base64() throws UnsupportedEncodingException
          {
              String name = new String( "herv\u00e9" ); // Hervé in Unicode

              byte[] utf8 = Strings.getBytesUtf8( name );
              System.out.println( Strings.dumpBytes( utf8 ) );
              String utf8Base64 = new String( Base64.encode( utf8 ) );
              System.out.println( "Herv\u00e9 utf-8 base 64 encoded : " + utf8Base64 );

              byte[] iso8859 = name.getBytes( "ISO-8859-1" );
              System.out.println( Strings.dumpBytes( iso8859 ) );
              String iso8859Base64 = new String( Base64.encode( iso8859 ) );
              System.out.println( "Herv\u00e9 ISO-8859-1 base 64 encoded : " + iso8859Base64 );
          }

          produces :

          0x68 0x65 0x72 0x76 0xC3 0xA9
          Hervé utf-8 base 64 encoded : aGVydsOp
          0x68 0x65 0x72 0x76 0xE9
          Hervé ISO-8859-1 base 64 encoded : aGVyduk=

          julien2512 added a comment -

          That's definitely not the point!
          Your Java code is correct, as is mine (which is almost the same).

          I've attached two files: "encode-decode LDAP demo 1.PNG" and "encode-decode LDAP demo 2.PNG".
          These screenshots were taken from Apache Directory Studio. They do not match your Java code's results (nor mine).

          julien2512 added a comment -

          To be clear, and to avoid wasting time:
          In Apache Directory Studio (for Windows), go to the LDAP menu and choose the item "Open encoder / decoder" ("Ouvrir l'encodeur / décodeur" in the French interface).
          The result displayed (see the attachments) does not match the expected result.

          Emmanuel Lecharny added a comment -

          Damn! I now see what you mean (thanks for the attachments).

          OK, we will have a look at this tool. I'm not sure it's a good one, though: we should let the user select the charset instead of forcing ISO-8859-1 (unless it uses the platform locale).
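
The idea of letting the user select the charset could look roughly like the sketch below. This is only an illustration under stated assumptions: the class `CharsetAwareBase64` and its methods are hypothetical, it uses the standard JDK `java.util.Base64`, and it does not claim to reflect how the tool was actually fixed. The charset is passed explicitly, with UTF-8 as the default since LDAP strings are expected to be UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch; not the actual Apache Directory Studio implementation.
class CharsetAwareBase64 {

    // Convert the text to bytes in the charset chosen by the user, then Base64-encode those bytes.
    static String encode(String text, Charset charset) {
        return Base64.getEncoder().encodeToString(text.getBytes(charset));
    }

    // Default to UTF-8 when no charset is chosen (assumption: LDAP strings are UTF-8).
    static String encode(String text) {
        return encode(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "herv\u00e9";
        System.out.println("UTF-8      : " + encode(name, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 : " + encode(name, StandardCharsets.ISO_8859_1));
    }
}
```

Passing the charset as a parameter keeps the label shown in the dialog and the conversion actually applied tied to the same value, which is the mismatch the screenshots point to.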

          Pierre-Arnaud Marcelot added a comment -

          Fixed at revision 1311698.
          http://svn.apache.org/viewvc?rev=1311698&view=rev

          julien2512 added a comment -

          Thank you !

          Pierre-Arnaud Marcelot added a comment -

          Many thanks to you for filing the issue.


            People

            • Assignee:
              Pierre-Arnaud Marcelot
              Reporter:
              julien2512

                Time Tracking

                Original Estimate: 1h
                Remaining Estimate: 1h
                Time Spent: Not Specified