Bug 3062 - SpamAssassin should be "locale safe"
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P3 normal
Target Milestone: 3.4.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-02-18 14:59 UTC by Malte S. Stretz
Modified: 2013-03-08 00:16 UTC



Description Malte S. Stretz 2004-02-18 14:59:18 UTC
Currently we use constructs like \w in several places. The problem is that
this changes SA's behaviour when it runs under a locale other than C.
 
Example: 
  "äöü" =~ /\w/ 
is false for LANG=C 
is true  for LANG=de 
 
SA's behaviour should be constant, no matter which locale it happens to run
under. The best way to achieve this would be to force UTF-8 everywhere, if
that is possible. But don't ask me how to do that ;-)
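
The effect described above can be reproduced with a minimal sketch, given here
under a few assumptions: the helper names are hypothetical, the German locale
name may not be installed on a given system, and the sketch relies on Perl
consulting the locale for \w only inside the scope of "use locale". It tests
the single Latin-1 byte 0xE4 ("ä") rather than a literal in the source file, so
the source encoding does not influence the result.

```perl
use strict;
use warnings;
use POSIX qw(locale_h);

# Hypothetical helper: does the Latin-1 byte 0xE4 ("ä") match \w
# under the rules of the currently active LC_CTYPE locale?
sub matches_under_locale {
    use locale;                      # \w consults LC_CTYPE only in this scope
    return "\xE4" =~ /\w/ ? 1 : 0;
}

# Locale-independent variant: an explicit ASCII class never consults the locale.
sub matches_ascii_only {
    return "\xE4" =~ /[A-Za-z0-9_]/ ? 1 : 0;
}

setlocale(LC_CTYPE, "C");
my $in_c = matches_under_locale();   # C locale: only ASCII counts as a word character

# The locale name below is an assumption and may not be installed;
# setlocale() returns undef in that case and the check is skipped.
my $in_de;
if (defined setlocale(LC_CTYPE, "de_DE.ISO-8859-1")) {
    $in_de = matches_under_locale();
}

my $ascii = matches_ascii_only();    # same answer under every locale

printf "C: %d, de: %s, ascii-only: %d\n",
    $in_c, defined $in_de ? $in_de : "n/a", $ascii;
```

Spelling out the character class is one way to get the same answer everywhere;
whether that is the right fix for any particular rule or code path in SA is a
separate question.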
Comment 1 Justin Mason 2004-02-28 16:14:28 UTC
Possibly we should force UTF-8 (if we can) and fix our code to be UTF-8 safe.
Comment 2 Daniel Quinlan 2005-03-30 01:09:13 UTC
move bug to Future milestone (previously set to Future -- I hope)
Comment 3 Henrik Krohns 2011-05-02 12:45:00 UTC
Does anyone have any idea how much work this would need? IMO this is something that should be looked into.
Comment 4 Mark Martinec 2011-05-04 18:24:18 UTC
> Anyone have any idea how much work this would need? IMO something that should
> be looked into..

In amavisd I call:

  use POSIX qw(locale_h);
  POSIX::setlocale(LC_TIME,"C");

early at the start of the master process, so everything underneath
works with a C locale, including SpamAssassin.

I guess the same could be done in spamd.

Not sure if that would cause any surprises or platform dependency
problems (e.g. amavisd doesn't need to work under Windows).
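
As a rough illustration of what the call above pins down, the sketch below sets
LC_TIME to "C" and then queries the categories individually. It is not taken
from amavisd or spamd; the variable names are hypothetical. Note that LC_TIME
governs date and time formatting, while the character classification behind \w
belongs to LC_CTYPE, so the other categories keep whatever the environment
supplied unless they are set as well.

```perl
use strict;
use warnings;
use POSIX qw(locale_h strftime);

# Pin time formatting to the C locale, as in the snippet above.
setlocale(LC_TIME, "C");

# Calling setlocale() with only a category queries it without changing it.
my $time_locale  = setlocale(LC_TIME);
my $ctype_locale = setlocale(LC_CTYPE);

# Day and month names produced by strftime() now follow the C locale.
my $stamp = strftime("%a, %d %b %Y", gmtime(0));

print "LC_TIME:  ", (defined $time_locale  ? $time_locale  : "unknown"), "\n";
print "LC_CTYPE: ", (defined $ctype_locale ? $ctype_locale : "unknown"), "\n";
print "epoch:    $stamp\n";
```

With LC_TIME set to "C" the epoch line reads "Thu, 01 Jan 1970" regardless of
the LANG the process was started with.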
Comment 5 Kevin A. McGrail 2011-05-28 09:56:12 UTC
I've changed the milestone on this to 3.4.0.  Picking a locale for consistent behavior is likely to be a great idea.  It may cause some grief in testing but I know people with amavis, etc. that are already doing this in practice.
Comment 6 Mark Martinec 2013-03-01 01:27:11 UTC
(In reply to comment #4)
> In amavisd I call a:
>   use POSIX qw(locale_h);
>   POSIX::setlocale(LC_TIME,"C");
> early at the start of the master process, so everything underneath
> works with a C locale, including SpamAssassin.
> I guess the same could be done in spamd.

(In reply to comment #5)
> I've changed the milestone on this to 3.4.0.  Picking a locale for
> consistent behavior is likely to be a great idea.  It may cause some grief
> in testing but I know people with amavis, etc. that are already doing this
> in practice.

Ok, let's see what breaks...
Adding to spamd:
  use POSIX qw(locale_h);
  POSIX::setlocale(LC_TIME,'C');

trunk:
  Bug 3062: SpamAssassin should be "locale safe"
Sending spamd/spamd.raw
Committed revision 1451451.
Comment 7 Mark Martinec 2013-03-08 00:16:39 UTC
> Ok, let's see what breaks...
> Adding to spamd:
>   use POSIX qw(locale_h);
>   POSIX::setlocale(LC_TIME,'C');

So far so good - closing.
Please re-open if any problem pops up.