SA Bugzilla – Bug 5437
warn: Malformed UTF-8 character
Last modified: 2007-04-27 02:48:11 UTC
I am running SpamAssassin 3.2.0-rc3. When I run "spamassassin -D -t -L < sample", I get a lot of warnings like this one:

warn: Malformed UTF-8 character (unexpected non-continuation byte 0x00, immediately after start byte 0xce) in pattern match (m//) at /var/lib/spamassassin/3.002000/70_sare_obfu0_cf_sare_sa-update_dostech_net/200510012000.cf, rule SARE_OBFU_VICODIN, line 1.

The rule SARE_OBFU_VICODIN looks like this:

body SARE_OBFU_VICODIN /(?!\bVICODIN\b)(?:\b([vu])(?:\1){0,2}|\B(\\\/|\xCE\xBD)(?:\2){0,2})[\W\d_]?([il1:\|\*ý-ýý-ýý]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)(?:\3){0,2}[\W\d_]?([c\*ýýýý]|\xC4[\x86-\x8D]|\xD0\xA1|\xD1\x81)(?:\4){0,2}[\W\d_]?([o0\*ýýýýý-ýý-ý]|\(\)|\[\]|\xC5[\x8C-\x91]|\xC6[\xA0-\xA1]|\xC7[\x91-\x92]|\xC7[\xBE-\xBF]|\xCE\x8C|\xCE\x98|\xCE\x9F|\xCE\xB8|\xCE\xBF|\xCF\x8C|\xD0\x9E|\xD0\xBE|\xD5\x95)(?:\5){0,2}[\W\d_]?([dý]|\xC4[\x8E-\x91])(?:\6){0,2}[\W\d_]?([il1:\|\*ý-ýý-ýý]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)(?:\7){0,2}[\W\d_]?(?:(n)(?:\8){0,2}\b|([ýý]|\|\\\||\xC5[\x83-\x8B]|\xCE\x9D|\xCE\xA0|\xCE\xAE|\xCE\xB7|\xD5\xB2|\xD5\xB8)(?:\9){0,2}\B)/i
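To illustrate what the warning is complaining about: in UTF-8, 0xCE is the lead byte of a two-byte sequence, so the next byte must be a continuation byte in the range 0x80-0xBF. The sketch below is plain Python, not SpamAssassin or Perl code, and the helper names are made up for illustration. It shows that 0xCE 0xBD (one of the sequences the rule matches) is well formed, while 0xCE followed by 0x00 is not.

```python
def expected_length(lead: int) -> int:
    """Return the sequence length implied by a UTF-8 lead byte, or 0 if
    the byte cannot start a sequence (hypothetical helper)."""
    if lead < 0x80:
        return 1          # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2          # 0xCE falls in this range
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0              # continuation byte or invalid lead byte


def is_continuation(byte: int) -> bool:
    """A UTF-8 continuation byte has the bit pattern 10xxxxxx."""
    return 0x80 <= byte <= 0xBF


def decodes_as_utf8(data: bytes) -> bool:
    """Check well-formedness using Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


good = b"\xce\xbd"   # GREEK SMALL LETTER NU, U+03BD
bad = b"\xce\x00"    # the byte pair named in the warning

print(expected_length(good[0]), is_continuation(good[1]), decodes_as_utf8(good))
print(expected_length(bad[0]), is_continuation(bad[1]), decodes_as_utf8(bad))
```

This only demonstrates the encoding rule behind the message. It does not show where the 0x00 byte in the sample comes from.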
Created attachment 3923: sample email
Created attachment 3924: output of spamassassin -D -L -t < 14350. > result 2>&1
Adding "use bytes;" to Message.pm makes the problem go away.
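For context on what "use bytes" changes: the pragma makes Perl apply byte semantics instead of character semantics within its scope, so escapes such as \xCE\xBD are compared octet by octet. The sketch below is only a Python analogy of that difference, not Perl and not how SpamAssassin is implemented, and all variable names are made up for illustration. The same two octets match while the text is treated as raw bytes, but stop matching once the text has been decoded into characters.

```python
import re

# Raw octets as they might appear in a UTF-8 encoded message body.
raw = "\u03bd".encode("utf-8")        # b"\xce\xbd", two octets

# Byte semantics: the pattern and the subject are both octet strings.
byte_rule = re.compile(rb"\xCE\xBD")
byte_hit = byte_rule.search(raw) is not None

# Character semantics: the subject has been decoded, so the two octets
# have become the single character U+03BD.
text = raw.decode("utf-8")
char_rule = re.compile("\u03bd")
char_hit = char_rule.search(text) is not None

# A byte-oriented pattern applied to decoded text: here "\xCE\xBD" means
# the two characters U+00CE and U+00BD, which do not occur in the text.
mixed_rule = re.compile("\xCE\xBD")
mixed_hit = mixed_rule.search(text) is not None

print(len(raw), len(text))            # octet count vs. character count
print(byte_hit, char_hit, mixed_hit)
```

The point of the analogy is that a pattern written as octets and a subject held as characters have to agree on one interpretation. Forcing byte semantics everywhere is one way to make them agree, at the cost of anything that relies on decoded characters.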
"use bytes" in Message.pm would also break normalize_charset support. :(
*** This bug has been marked as a duplicate of 3787 ***