Bug 5437 - warn: Malformed UTF-8 character
Summary: warn: Malformed UTF-8 character
Status: RESOLVED DUPLICATE of bug 3787
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules
Version: SVN Trunk (Latest Devel Version)
Hardware: PC Linux
Importance: P5 normal
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-04-25 10:43 UTC by Vincent Li
Modified: 2007-04-27 02:48 UTC
CC List: 0 users



Attachment Type Modified Status Actions Submitter/CLA Status
sample email text/plain None Vincent Li [NoCLA]
attachment of spamassassin -D -L -t < 14350. > result 2>&1 text/plain None Vincent Li [NoCLA]

Description Vincent Li 2007-04-25 10:43:15 UTC
I am running SpamAssassin 3.2.0-rc3. I get a lot of warnings like the following
when I run spamassassin -D -t -L < sample:

"warn: Malformed UTF-8 character (unexpected non-continuation byte 0x00,
immediately after start byte 0xce) in pattern match (m//) at
/var/lib/spamassassin/3.002000/70_sare_obfu0_cf_sare_sa-update_dostech_net/200510012000.cf,
rule SARE_OBFU_VICODIN, line 1."

The rule SARE_OBFU_VICODIN looks like this:

body      SARE_OBFU_VICODIN       
/(?!\bVICODIN\b)(?:\b([vu])(?:\1){0,2}|\B(\\\/|\xCE\xBD)(?:\2){0,2})[\W\d_]?
([il1:\|\*ý-ýý-ýý]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82
|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE
\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)(?:\3){0,2}[\W\d_]?([c\*ýýýý]|\xC4[\x86-\x8D]|
\xD0\xA1|\xD1\x81)(?:\4){0,2}[\W\d_]?([o0\*ýýýýý-ýý-ý]|\(\)|\[\]|\xC5[\x8C-\x91]
|\xC6[\xA0-\xA1]|\xC7[\x91-\x92]|\xC7[\xBE-\xBF]|\xCE\x8C|\xCE\x98|\xCE\x9F|\xCE\xB8|\xCE\xBF|\xCF\x8C|\xD0\x9E|\xD0\xBE|\xD5\x95)(?:\5){0,2}[\W\d_]?([dý]|\xC4[\x8E-\x91])(?:\6){0,2}[\W\d_]?([il1:\|\*ý-ýý-ýý]|\xC4[\xA8-\xB0]|\xC4\xBA|\xC4\xBC|\xC4\xBE|\xC5\x80|\xC5\x82|\xC7[\x8F-\x90]|\xD0[\x86-\x87]|\xD1[\x96-\x97]|\xCE\x8A|\xCE\x90|\xCE\x99|\xCE\xAA|\xCE\xAF|\xCE\xB9|\xCF\x8A)(?:\7){0,2}[\W\d_]?(?:(n)(?:\8){0,2}\b|([ýý]|\|\\\||\xC5[\x83-\x8B]|\xCE\x9D|\xCE\xA0|\xCE\xAE|\xCE\xB7|\xD5\xB2|\xD5\xB8)(?:\9){0,2}\B)/i
Comment 1 Vincent Li 2007-04-25 10:47:20 UTC
Created attachment 3923
sample email
Comment 2 Vincent Li 2007-04-25 10:49:32 UTC
Created attachment 3924
attachment of spamassassin -D -L -t < 14350. > result 2>&1
Comment 3 Vincent Li 2007-04-25 14:20:05 UTC
Adding "use bytes;" in Message.pm makes the problem go away.
Comment 4 Justin Mason 2007-04-25 14:53:41 UTC
"use bytes" in Message.pm would also break normalize_charset support. :(
Comment 5 Justin Mason 2007-04-27 02:48:11 UTC

*** This bug has been marked as a duplicate of 3787 ***
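
The tension between comments 3 and 4 can be sketched with a rough Python analogy. This is an assumption-laden illustration, not Perl's "use bytes" semantics and not SpamAssassin code: a pattern written as UTF-8 bytes matches the encoded message, but the same escapes read as characters do not match text that has already been decoded, which is roughly the situation charset normalization creates.

```python
import re

text = "\u03bd"              # decoded (character) form of Greek nu
raw = text.encode("utf-8")   # encoded (byte) form: b"\xce\xbd"

byte_rule = re.compile(rb"\xce\xbd")   # pattern written as UTF-8 bytes
char_rule = re.compile("\u03bd")       # pattern written as one character

# Each pattern matches data of its own kind.
matches_raw = byte_rule.search(raw) is not None
matches_text = char_rule.search(text) is not None

# The byte-style escapes read as characters mean U+00CE U+00BD,
# which is not the single character U+03BD, so nothing matches.
mismatch = re.search("\xce\xbd", text) is None

print(matches_raw, matches_text, mismatch)
```

Forcing byte semantics everywhere keeps byte-oriented rules working but gives up character-level processing of decoded text, which is the kind of breakage comment 4 is concerned about.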