SA Bugzilla – Bug 3062
SpamAssassin should be "locale safe"
Last modified: 2013-03-08 00:16:39 UTC
Currently we use constructs like \w in several places. The problem is that this changes SA's behaviour when it runs under a locale other than C. Example: "äöü" =~ /\w/ is false for LANG=C but true for LANG=de. SA's behaviour should be constant, no matter which locale it happens to run under. The best way to achieve this would be to force UTF-8 everywhere. But don't ask me how to do that ;-)
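For illustration, here is a minimal sketch of two ways a pattern can be written so that it does not depend on the locale at all. It is not taken from the SpamAssassin source. It assumes the sample string is Latin-1 encoded (the report does not say) and that Perl 5.14 or later is available for the /a modifier. Note also that, as far as the Perl documentation goes, \w only consults the locale in code where `use locale` is in effect.

```perl
use 5.014;      # /a modifier needs Perl 5.14+; this also enables strict
use warnings;

# "äöü" as Latin-1 bytes (assumed encoding, for illustration only).
my $bytes = "\xE4\xF6\xFC";

# An explicit ASCII class never consults the locale.
my $explicit = ($bytes =~ /[A-Za-z0-9_]/) ? 1 : 0;

# \w with the /a modifier is restricted to ASCII word characters.
my $ascii_w = ($bytes =~ /\w/a) ? 1 : 0;

# A plain ASCII word still matches both forms.
my $plain = ("spam" =~ /\w/a && "spam" =~ /[A-Za-z0-9_]/) ? 1 : 0;

print "explicit class: $explicit, \\w with /a: $ascii_w, plain ASCII: $plain\n";
# prints "explicit class: 0, \w with /a: 0, plain ASCII: 1"
```

Either form gives the same answer under LANG=C and LANG=de, at the cost of never treating accented letters as word characters.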
Possibly we should force UTF-8 (if we can) and fix our code to be UTF-8 safe.
move bug to Future milestone (previously set to Future -- I hope)
Does anyone have any idea how much work this would need? IMO this is something that should be looked into.
> Anyone have any idea how much work this would need? IMO something that should
> be looked into..

In amavisd I call:

  use POSIX qw(locale_h);
  POSIX::setlocale(LC_TIME,"C");

early at the start of the master process, so everything underneath works with a C locale, including SpamAssassin. I guess the same could be done in spamd. Not sure if that would cause any surprises or platform dependency problems (e.g. amavisd doesn't need to work under Windows).
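The call quoted above can be tried standalone. The following is a minimal sketch, not the amavisd or spamd code itself; the variable names and the strftime check are added for illustration. Note that LC_TIME is only the date/time formatting category. Character classification falls under LC_CTYPE, so pinning that category (or LC_ALL) would presumably be needed to cover the \w case from the original report.

```perl
use strict;
use warnings;
use POSIX qw(locale_h strftime);

# Pin only the time category to the C locale, as in the quoted snippet.
setlocale(LC_TIME, 'C');

# Calling setlocale with just a category returns the current setting.
my $current = setlocale(LC_TIME);
print "LC_TIME is now: $current\n";

# Weekday and month names now come out in English regardless of LANG.
my $stamp = strftime('%a, %d %b %Y', gmtime(0));
print "$stamp\n";    # prints "Thu, 01 Jan 1970"
```

Running this with, say, LANG=de_DE.UTF-8 in the environment should still print the English abbreviations, since the explicit setlocale call overrides the inherited setting for that category.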
I've changed the milestone on this to 3.4.0. Picking a locale for consistent behavior is likely to be a great idea. It may cause some grief in testing but I know people with amavis, etc. that are already doing this in practice.
(In reply to comment #4)
> In amavisd I call a:
> use POSIX qw(locale_h);
> POSIX::setlocale(LC_TIME,"C");
> early at the start of the master process, so everything underneath
> works with a C locale, including SpamAssassin.
> I guess the same could be done in spamd.

(In reply to comment #5)
> I've changed the milestone on this to 3.4.0. Picking a locale for
> consistent behavior is likely to be a great idea. It may cause some grief
> in testing but I know people with amavis, etc. that are already doing this
> in practice.

Ok, let's see what breaks...

Adding to spamd:

  use POSIX qw(locale_h);
  POSIX::setlocale(LC_TIME,'C');

trunk:
  Bug 3062: SpamAssassin should be "locale safe"
Sending spamd/spamd.raw
Committed revision 1451451.
> Ok, let's see what breaks...
> Adding to spamd:
> use POSIX qw(locale_h);
> POSIX::setlocale(LC_TIME,'C');

So far so good - closing. Please re-open if any problem pops up.