  1. Commons Lang
  2. LANG-729

StringEscapeUtils.unescapeXml(str) does not support supplemental characters.

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 2.6
    • Fix Version/s: 3.0
    • Component/s: lang.*

    Description

      StringEscapeUtils.unescapeXml(str) does not unescape numeric character references for supplementary characters (code points above U+FFFF):

      String str2 = StringEscapeUtils.unescapeXml("&#144308;");
      System.out.println(str2.codePointAt(0));
      // prints 38, the code point of '&': the reference was left unescaped

      The output should be 144308 (U+233B4).
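
The expected behavior can be illustrated with a minimal, standard-library-only sketch. It is not the Commons Lang implementation: the class name NumericRefSketch and the helper unescapeDecimalRefs are hypothetical, and the helper handles decimal references only, without validating against the XML specification. The point it shows is that StringBuilder.appendCodePoint writes a surrogate pair for code points above U+FFFF, so the first code point of the result is 144308 rather than 38.

```java
class NumericRefSketch {

    // Hypothetical helper (not part of Commons Lang): replaces decimal
    // references such as "&#144308;" with the character they denote and
    // copies everything else unchanged.
    public static String unescapeDecimalRefs(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            // Look for "&#" followed by at least one character and a ';'.
            int end = input.startsWith("&#", i) ? input.indexOf(';', i + 2) : -1;
            if (end > i + 2) {
                try {
                    int codePoint = Integer.parseInt(input.substring(i + 2, end));
                    if (Character.isValidCodePoint(codePoint)) {
                        // Emits two chars (a surrogate pair) when codePoint > 0xFFFF.
                        out.appendCodePoint(codePoint);
                        i = end + 1;
                        continue;
                    }
                } catch (NumberFormatException e) {
                    // Not a decimal reference; fall through and copy '&' literally.
                }
            }
            out.append(input.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String str2 = unescapeDecimalRefs("&#144308;");
        System.out.println(str2.codePointAt(0)); // 144308
        System.out.println(str2.length());       // 2 (one surrogate pair)
    }
}
```

Working with int code points instead of casting the parsed value to a single char is what keeps supplementary characters intact in this sketch.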

      Currently, StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) equals str, so nothing appears to be wrong. However, as we reported in LANG-728, StringEscapeUtils.escapeXml(str) has a bug. Once that bug is fixed, StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) will no longer equal str for such input, which is not what we expect. (Of course, we do not expect StringEscapeUtils.unescapeXml(StringEscapeUtils.escapeXml(str)) to equal str in every case.)

          People

            Assignee: Unassigned
            Reporter: Taro Yabuki (yabuki)