Directory Studio / DIRSTUDIO-963

Why is UTF-8 escaped in DN strings since 2.0.0?

    Details

      Description

      I have a directory that includes many DNs containing UTF-8 characters. I expect the characters to be displayed correctly in the LDAP browser tree. The server claims to support LDAP v3.

      Recently I installed Apache Directory Studio 2.0.0-M8 and realized that new entries created with the Directory Studio have UTF-8 characters escaped when a DN is created, e.g. "TESTСкаж...,dc=ru" is replaced with "TEST\D0\A1\D0\BA\D0\B0\D0\B6...,dc=ru".
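
To make the reported transformation concrete, here is a minimal sketch of the kind of conversion that yields the escaped form: every byte of each non-ASCII character's UTF-8 encoding is written as a backslash followed by two hex digits. The function name `hex_escape_non_ascii` is hypothetical and this is not Directory Studio's actual code; it only reproduces the observable result quoted above.

```python
def hex_escape_non_ascii(value: str) -> str:
    """Backslash-hex-escape every UTF-8 byte of each non-ASCII character.

    Illustrative only: mimics the output reported for 2.0.0-M8,
    not the real implementation.
    """
    out = []
    for ch in value:
        if ord(ch) < 0x80:
            # ASCII characters pass through unchanged in this sketch.
            out.append(ch)
        else:
            # One "\XX" pair per byte of the UTF-8 encoding.
            out.extend(f"\\{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


print(hex_escape_non_ascii("TESTСкаж"))  # TEST\D0\A1\D0\BA\D0\B0\D0\B6
```

Each Cyrillic letter occupies two bytes in UTF-8, which is why four letters turn into eight escaped pairs.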

      Since the escaped sequences make the tree illegible in the LDAP Browser, I had to manually rename the new entries using the ldapmodrdn utility from the OpenLDAP distribution.
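
When many entries have to be renamed by hand, the readable value can be recovered from the escaped one before passing it to a tool such as ldapmodrdn. The following is a rough sketch under stated assumptions: the helper name `unescape_hex_pairs` is hypothetical, only hex pairs in the range 80-FF are decoded (so escapes such as `\2C` for a comma stay escaped), and runs that are not valid UTF-8 are left untouched.

```python
import re

# Either an escaped backslash (left alone) or a run of "\XX" pairs
# whose bytes are all >= 0x80, i.e. candidates for multi-byte UTF-8.
_PATTERN = re.compile(r"\\\\|((?:\\[89A-Fa-f][0-9A-Fa-f])+)")


def unescape_hex_pairs(value: str) -> str:
    """Turn hex-escaped non-ASCII bytes back into UTF-8 characters."""

    def decode(match: re.Match) -> str:
        run = match.group(1)
        if run is None:
            # Escaped backslash: keep it exactly as written.
            return match.group(0)
        raw = bytes.fromhex(run.replace("\\", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: leave the escapes in place.
            return run

    return _PATTERN.sub(decode, value)


print(unescape_hex_pairs("TEST\\D0\\A1\\D0\\BA\\D0\\B0\\D0\\B6"))  # TESTСкаж
```

Restricting the decoder to bytes above 0x7F is deliberate: decoding escaped ASCII specials would change the meaning of the DN.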

      I have tested a few prior versions of Apache Directory Studio. Here is a summary of the results:

      1.5.3 - entries are created with UTF-8 characters in the DN, as expected.
      2.0.0-M3 - the directory fails to load with ArrayIndexOutOfBounds exceptions.
      2.0.0-M7, 2.0.0-M8 - when the DN is formed by Directory Studio, all UTF-8 characters are escaped.

      Is this expected behaviour or a regression? If it is expected, is there a way to get UTF-8 characters back into the DNs?

      1. Screenshot-Rename Entry - 1.5.3.png
        20 kB
        Dmitri Chubarov
      2. Screenshot-Rename Entry - 2.0.0-M8.png
        20 kB
        Dmitri Chubarov

          Activity

          dimch Dmitri Chubarov added a comment -

          It looks like the issue in DIRSTUDIO-938 is not in the rendering, but in the way DN is formed, therefore not exactly a duplicate, but both can be resolved simultaneously.

          dimch Dmitri Chubarov added a comment -

          The problem seen with 2.0.0-M3 has been fixed in 2.0.0-M6 as per DIRSTUDIO-861

          dimch Dmitri Chubarov added a comment -

          Here is the escaped DN as generated in 2.0.0-M8

          dimch Dmitri Chubarov added a comment -

          This is a screenshot from 1.5.3 and that is how I would expect the Directory Studio to behave.

          elecharny Emmanuel Lecharny added a comment -

          The DN/RDN which are created with UTF-8 chars must be kept as they were created. The escaping mechanism is just meant to be used in filters.

          If the entry's DN is presented with escaped chars, then it's clearly a bug.

          elecharny Emmanuel Lecharny added a comment -

          I confirm this is a bug.

          o create a new entry
          o select 'person' as an ObjectClass
          o add a 'cn' whose value is 'Lécharny'
          o the DN is converted to a String representation of the RDN, where the 'é' is escaped: cn=L\C3\A9charny,ou=system

          This conversion should never occur at this point. RFC 4514 is pretty clear about that:

          distinguishedName = [ relativeDistinguishedName *( COMMA relativeDistinguishedName ) ]
          relativeDistinguishedName = attributeTypeAndValue *( PLUS attributeTypeAndValue )
          attributeTypeAndValue = attributeType EQUALS attributeValue
          attributeValue = string / hexstring
          string = [ ( leadchar / pair ) [ *( stringchar / pair ) ( trailchar / pair ) ] ]
          leadchar = LUTF1 / UTFMB
          stringchar = SUTF1 / UTFMB
          trailchar = TUTF1 / UTFMB

          and from RFC 4512:
          UTFMB = UTF2 / UTF3 / UTF4
          UTF2 = %xC2-DF UTF0
          UTF3 = %xE0 %xA0-BF UTF0 / %xE1-EC 2(UTF0) / %xED %x80-9F UTF0 / %xEE-EF 2(UTF0)
          UTF4 = %xF0 %x90-BF 2(UTF0) / %xF1-F3 3(UTF0) / %xF4 %x80-8F 2(UTF0)
          UTF0 = %x80-BF
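
The UTFMB production quoted above can be checked mechanically. The sketch below transcribes the UTF2/UTF3/UTF4 rules into a bytes regular expression and tests whether a single character's UTF-8 encoding matches. The names `UTFMB` and `is_utfmb` are chosen here for illustration; this is a reading aid for the grammar, not code from any LDAP library.

```python
import re

# Direct transcription of the RFC 4512 productions into byte patterns.
UTF0 = rb"[\x80-\xBF]"
UTF2 = rb"[\xC2-\xDF]" + UTF0
UTF3 = rb"|".join([
    rb"\xE0[\xA0-\xBF]" + UTF0,
    rb"[\xE1-\xEC]" + UTF0 + rb"{2}",
    rb"\xED[\x80-\x9F]" + UTF0,
    rb"[\xEE-\xEF]" + UTF0 + rb"{2}",
])
UTF4 = rb"|".join([
    rb"\xF0[\x90-\xBF]" + UTF0 + rb"{2}",
    rb"[\xF1-\xF3]" + UTF0 + rb"{3}",
    rb"\xF4[\x80-\x8F]" + UTF0 + rb"{2}",
])
UTFMB = re.compile(rb"(?:" + UTF2 + rb"|" + UTF3 + rb"|" + UTF4 + rb")")


def is_utfmb(ch: str) -> bool:
    """True if the UTF-8 encoding of `ch` is exactly one UTFMB sequence."""
    return UTFMB.fullmatch(ch.encode("utf-8")) is not None


# 'é' encodes to C3 A9, which matches UTF2, so it is allowed
# unescaped wherever the grammar permits UTFMB.
print(is_utfmb("é"), is_utfmb("a"))  # True False
```

Since leadchar, stringchar and trailchar all accept UTFMB, a multi-byte character such as 'é' is a legal unescaped member of an attribute value string.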

          seelmann Stefan Seelmann added a comment -

          In the entry create and rename dialogs we use the Rdn.escapeValue() method, which seems to have changed: it now escapes every Unicode character, not only certain special characters. I will dig deeper tomorrow.
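
For contrast with the behaviour described in this comment, here is a minimal sketch of escaping limited to the characters that RFC 4514 (section 2.4) requires to be escaped: the specials `"`, `+`, `,`, `;`, `<`, `>`, `\`, a NUL character, a leading space or `#`, and a trailing space. The function `escape_rdn_value` is a hypothetical illustration and is not the Rdn.escapeValue() implementation, whose source is not shown in this issue.

```python
# Characters that must always be backslash-escaped in an attribute value.
SPECIALS = set('"+,;<>\\')


def escape_rdn_value(value: str) -> str:
    """Escape only what RFC 4514 requires; leave other characters as-is."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch == "\x00":
            out.append("\\00")      # NUL is written as a hex pair
        elif ch in SPECIALS:
            out.append("\\" + ch)
        elif i == 0 and ch in " #":
            out.append("\\" + ch)   # leading space or '#'
        elif i == last and ch == " ":
            out.append("\\" + ch)   # trailing space
        else:
            out.append(ch)          # includes all non-ASCII characters
    return "".join(out)


print(escape_rdn_value("Lécharny"))     # Lécharny
print(escape_rdn_value("Smith, John"))  # Smith\, John
```

With this approach a value such as 'Lécharny' is returned unchanged, which matches the 1.5.3 behaviour the reporter expected.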

          seelmann Stefan Seelmann added a comment -

          Fixed here http://svn.apache.org/r1596697

            People

            • Assignee: seelmann Stefan Seelmann
            • Reporter: dimch Dmitri Chubarov
            • Votes: 0
            • Watchers: 3
