LOG4NET-229

Japanese characters get garbled with log4net.Layout.XmlLayoutSchemaLog4j

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.10
    • Fix Version/s: 1.2.11
    • Component/s: Appenders
    • Labels:
      None
    • Environment:
      log4net 1.2.10, .net 2.0

      Description

      With XmlLayoutSchemaLog4j, all Japanese characters (as far as I can see) are replaced with '?'
      because the log4net.Util.Transform.INVALIDCHARS regular expression is not correct.
      This issue may also affect other languages, such as Chinese or Korean.

      http://issues.apache.org/jira/browse/LOG4NET-22 says that the permitted characters are

      #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

      but the regex for invalid characters is

      private static Regex INVALIDCHARS=new Regex(@"[^\x09\x0A\x0D\x20-\xFF\u00FF-\u07FF\uE000-\uFFFD]",RegexOptions.Compiled);

      So 0x0800 ~ 0xD7FF are mistreated as invalid characters.
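
The gap described above can be reproduced with a small console program. This is a minimal sketch: the character class is copied from the report, while the class name `InvalidCharsDemo` and the sample strings are illustrative assumptions, not log4net code.

```csharp
using System;
using System.Text.RegularExpressions;

static class InvalidCharsDemo
{
    // Character class as quoted in the report (log4net 1.2.10).
    static readonly Regex Reported = new Regex(
        @"[^\x09\x0A\x0D\x20-\xFF\u00FF-\u07FF\uE000-\uFFFD]");

    static void Main()
    {
        // U+00E9 lies inside \x20-\xFF, so it is not flagged.
        Console.WriteLine(Reported.IsMatch("\u00E9"));   // False

        // U+07FF is the last code point the class lists before the gap.
        Console.WriteLine(Reported.IsMatch("\u07FF"));   // False

        // U+0800 is the first code point inside the gap.
        Console.WriteLine(Reported.IsMatch("\u0800"));   // True

        // U+65E5 U+672C U+8A9E (the word "Nihongo") all fall in
        // U+0800..U+D7FF, so each one is replaced.
        string japanese = "\u65E5\u672C\u8A9E";
        Console.WriteLine(Reported.Replace(japanese, "?")); // ???
    }
}
```

Most CJK ideographs, hiragana, katakana, and Hangul syllables are encoded between U+0800 and U+D7FF, which is consistent with the report that nearly all Japanese text is affected.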

      0xD800 ~ 0xDFFF should also be permitted, because these code units are used to express 0x10000 ~ 0x10FFFF in UTF-16
      (0xD800 ~ 0xDFFF are invalid as Unicode code points, but they are valid in UTF-16).

      So the INVALIDCHARS regex should be "[^\x09\x0A\x0D\x20-\uFFFD]"
      (the above code is NOT TESTED).
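
The surrogate argument can be checked with a short sketch. Both patterns below are hypothetical candidates written for illustration and are not taken from the committed fix: one lists only the XML 1.0 ranges inside the Basic Multilingual Plane, and the other admits everything from U+0020 through U+FFFD, as the reasoning above implies. The class name `SurrogateDemo` and the sample character U+20BB7 are likewise assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

static class SurrogateDemo
{
    // Hypothetical candidate A: XML 1.0 ranges within the BMP, surrogates excluded.
    static readonly Regex WithoutSurrogates = new Regex(
        @"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD]");

    // Hypothetical candidate B: one contiguous range that also admits U+D800..U+DFFF.
    static readonly Regex WithSurrogates = new Regex(
        @"[^\x09\x0A\x0D\x20-\uFFFD]");

    static void Main()
    {
        // U+20BB7 lies above U+FFFF, so a .NET string stores it as two chars.
        string s = char.ConvertFromUtf32(0x20BB7);
        Console.WriteLine(s.Length);                     // 2
        Console.WriteLine(((int)s[0]).ToString("X4"));   // D842
        Console.WriteLine(((int)s[1]).ToString("X4"));   // DFB7

        // .NET Regex matches per UTF-16 code unit, so candidate A flags
        // each half of the pair while candidate B lets the pair through.
        Console.WriteLine(WithoutSurrogates.IsMatch(s)); // True
        Console.WriteLine(WithSurrogates.IsMatch(s));    // False

        // Both candidates accept the BMP range that the reported class missed.
        Console.WriteLine(WithSurrogates.IsMatch("\u65E5\u672C\u8A9E")); // False
    }
}
```

One trade-off to keep in mind: a class that admits U+D800..U+DFFF also accepts unpaired surrogates, which are not well-formed UTF-16, so a stricter filter would need to check pairing separately.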

        Activity

        Stefan Bodewig added a comment -

        Fixed with svn revision 1167144.

          People

          • Assignee: Unassigned
          • Reporter: Atsushi Suzuki
          • Votes: 1
          • Watchers: 0

              Time Tracking

              • Estimated: 2h
              • Remaining: 2h
              • Logged: Not Specified