Bug 3364 - [review] Use of customized header yields "nan" for _HITS_ variable
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Libraries
Version: 2.63
Hardware: PC Linux
Importance: P5 normal
Target Milestone: 3.3.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard: needs 1 vote
Keywords:
Depends on:
Blocks: 3208
Reported: 2004-05-09 12:16 UTC by Daniel Richard G.
Modified: 2009-12-17 15:54 UTC

Attachments (description | type | status | submitter/CLA status):
- Configuration with custom header declarations | text/plain | None | Daniel Richard G. [NoCLA]
- Mailbox of nan-bug-producing spam messages | text/plain | None | Daniel Richard G. [NoCLA]
- warn if a plugin tries to contribute a NaN score to the total | patch | None | Mark Martinec [HasCLA]
- Deal with NaN in AutoWhitelist and PerMsgStatus | patch | None | Mark Martinec [HasCLA]
- Deal with NaN in AutoWhitelist and PerMsgStatus, fix a bad AWL record | patch | None | Mark Martinec [HasCLA]

Description Daniel Richard G. 2004-05-09 12:16:18 UTC
I use SpamAssassin with a customized ham/spam-tag header containing the
substring "score=_HITS_", and have noticed in the last few weeks a number of
messages tagged with "score=nan". When I run these through SA without any custom
header declarations, a valid _HITS_ value is always reported correctly in the
X-Spam-Status header.

I will attach to this bug report a pared-down user_prefs file, and an mbox file
of approximately thirty messages from my spam corpus which produce the "nan"
result (though in my somewhat more complex configuration, I believe this bug has
turned up more times than that).

The messages in the mbox file have all been freshly run through SA 2.63 (using
the aforementioned user_prefs file), and all exhibit the erroneous spam-tag header.
Comment 1 Daniel Richard G. 2004-05-09 12:19:23 UTC
Created attachment 1941
Configuration with custom header declarations

This is a pared-down version of my "production" user_prefs file, enough to
recreate the bug.
Comment 2 Daniel Richard G. 2004-05-09 12:24:41 UTC
Created attachment 1942
Mailbox of nan-bug-producing spam messages

All the messages in this mailbox file produce the bug in question. They were
all re-tagged in a recent test run using the attached user_prefs file and SA
2.63.
Comment 3 Justin Mason 2004-05-12 21:24:23 UTC
odd.  adding to 3.0.0 queue
Comment 4 Theo Van Dinter 2004-05-19 15:11:36 UTC
can't reproduce this on my boxes, fyi.

can you give some details about your environment?  perl version, platform,
etc?  do the default headers work ok?

for me:  perl 5.6.1, Linux and perl 5.8.1RC3, Mac OS X.

Looking at perlop:

           perl -le '$a = NaN; print "No NaN support here" if $a == $a'
           perl -le '$a = NaN; print "NaN support here" if $a != $a'

My Linux box says it's supported, Mac OS X says it's not, so this should be
reproducible on the Linux side, in theory.
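
For readers unfamiliar with the perlop idiom above: it relies on the IEEE 754 rule that a NaN compares unequal to every value, itself included. Below is a minimal sketch of the same check, written in Python purely for illustration (SpamAssassin and the one-liners above are Perl; the helper name `is_nan` is hypothetical).

```python
import math

def is_nan(value: float) -> bool:
    """Return True when value is a NaN, using the self-inequality test."""
    # Under IEEE 754, NaN is the only floating-point value for which x != x holds.
    return value != value

nan = float("nan")  # construct a NaN directly

print(is_nan(nan))      # a NaN is not equal to itself
print(is_nan(5.0))      # an ordinary score compares equal to itself
print(math.isnan(nan))  # standard-library equivalent of the same test
```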
Comment 5 Daniel Richard G. 2004-05-20 21:35:24 UTC
This is on an x86 Debian Linux system, current "testing" rev; Perl v5.8.3 and
negative (I repeat, negative!) on the NaN support. The default X-Spam-Status
header---as obtained by commenting out the last three lines of the attached
user_prefs file---works perfectly every time.

(The setup I have here is quite plain-vanilla---Perl and all the system
libraries are untouched Debian packages. I can only hope, of course, that the
problem doesn't relate to any Debian-specific patches.)
Comment 6 Theo Van Dinter 2004-05-20 21:56:05 UTC
just for another data point, I took the 5.8.4 install on my Linux box (RH ES 3
update 2 BTW)...  it says NaN isn't supported (must be a 5.8 thing), but both
2.63 and 3.0 have no problem.

I'm willing to get on and do some debugging if I can get access to the machine, btw.
Comment 7 Justin Mason 2004-05-25 19:34:32 UTC
could be a locale thing. suggest we drop this from the 3.0.0 blocker unless
anyone can reproduce it...
Comment 8 Theo Van Dinter 2004-05-25 22:46:21 UTC
> could be a locale thing. suggest we drop this from the 3.0.0 blocker unless
> anyone can reproduce it...

I need to arrange a time with Daniel to get access to the machine to do
debugging.   Unfortunately, work has been sucking up absolutely all my time
recently, so I haven't even had a chance to respond to talk about scheduling a
time... Sorry Daniel. :|
Comment 9 Malte S. Stretz 2004-05-26 12:40:16 UTC
Daniel: Does this happen only for spam messages? If so, does it start to 
happen only for ham if you swap the two add_header lines in your config file? 
(Just poking around.) 
 
Theo: You seem to know a bit about this NaN stuff, is it maybe related to the 
usage of sprintf()? 
Comment 10 Theo Van Dinter 2004-05-26 12:44:02 UTC
On Wed, May 26, 2004 at 12:40:17PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Theo: You seem to know a bit about this NaN stuff, is it maybe related to the 
> usage of sprintf()? 

Actually I know almost nothing about NaN.  Just what I've picked up doing
google and POD spelunking.  But I figure if I can reproduce the problem,
I can debug the problem, which is a step ahead of where we are now. ;)

Comment 11 Daniel Richard G. 2004-05-27 00:22:56 UTC
Gentlemen, I have bad (good) news:

I have had absolutely no luck in reproducing this bug on other Linux systems,
and what is more, it would not recur in a separate installation of SA _on the
same system_ where the original bug occurred.

Investigating this further, I removed the Debian SA package, and all traces
thereof, and reinstalled it again. Lo and behold, the "nan" behavior was no
more. I re-ran my entire spam corpus through SA, and not one was processed
incorrectly.

It would appear that the problem is specific to the Debian packaging of SA, and
even more, the "bug" turns up only when the package has gone through some
particular upgrade path (as opposed to being freshly installed). Maybe one file
was from an older package revision, maybe a configuration file wasn't updated
correctly... it could be any number of things.

In light of this, I have resolved this bug as WORKSFORME, and apologize
sincerely for the dead end it has produced. If the problem crops up again, I'll
be more careful to rule out packaging issues, and return here to reopen the bug.
Comment 12 Mark Martinec 2008-10-10 08:01:56 UTC
Created attachment 4375
warn if a plugin tries to contribute a NaN score to the total
Comment 13 Mark Martinec 2008-10-13 08:27:06 UTC
Reopening a bug based on new evidence from a mailing list, topic
"spam score not counted correctly", problem report by Benedict Verheyen.

from Mark:

> Please try the following patch (to 3.2.5).
> It should produce a warning on stderr when some plugin
> would attempt to add a NaN to score:

> That's puzzling for me too. As far as I can tell, a NaN can only happen
> as a result of floating-point arithmetic (like: (-3)**0.5 ), or when
> directly specified in Perl code, e.g. $a = NaN;  I don't think it can
> result from simple string conversions and the like.
>
> Catching which rule or plugin is trying to add it would help narrow
> down the cause. I hope all score additions go through the now instrumented
> subroutine; otherwise someone more knowledgeable in SA internals may
> indicate which additional code paths should add a test for a NaN.

Benedict writes:

> I patched the score part as indicated in Mark's mail, and when I run
> spamassassin in debug mode, I do see a message pop up that points
> to a NaN score:
> [6443] warn: !!!!!!!! rules: score 'nan' for rule 'AWL' in 'AWL: '
> 'From: address is in the auto white-list' at /usr/share/perl5/Mail/
>    SpamAssassin/PerMsgStatus.pm line 2146.
>
> The message is now correctly marked as spam and no nan reference
> is printed in the spam report (in debug mode, that is).
>
> When I run spamassassin with the lint option, no errors pop up,
> only this message:
> warn: Character in 'C' format wrapped in pack at
> /usr/share/perl5/Mail/SpamAssassin/Util.pm line 800.
> If I edit Util.pm and change the my_inet_aton sub (more
> specifically, C4 to U4), I don't get that warning. But testing revealed
> that it doesn't influence the scoring/nan.
>
> As for custom rules, I reinstalled spamassassin (with the purge option in
> Debian, which should remove all dirs as well), and I'm not using any
> custom rules in my /etc/spamassassin/local.cf.
> I do, however, alter scores in my local.cf (I started doing so because
> the nan's were influencing the scores).
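
The instrumentation described in Mark's mail above amounts to checking each rule's score before it is added to the running total. A single NaN turns the whole sum into NaN, and a NaN total never compares greater than the spam threshold. The following is a rough Python sketch of that idea, not the actual patch to PerMsgStatus.pm. The function `add_rule_score`, the rule names other than AWL, the scores, and the threshold are all made up for illustration.

```python
import math
import sys

def add_rule_score(total: float, rule: str, score: float) -> float:
    """Add one rule's score to the total, skipping NaN contributions."""
    if math.isnan(score):
        # Warn and leave the total untouched instead of poisoning it.
        print(f"warn: rules: score 'nan' for rule '{rule}' ignored", file=sys.stderr)
        return total
    return total + score

# Hypothetical hits for one message; only the AWL entry is a NaN.
hits = [("RULE_A", 2.5), ("AWL", float("nan")), ("RULE_B", 3.0)]

unguarded = sum(score for _, score in hits)  # plain addition, no check

guarded = 0.0
for rule, score in hits:
    guarded = add_rule_score(guarded, rule, score)

threshold = 4.0  # example spam threshold
print(unguarded, unguarded > threshold)  # nan False
print(guarded, guarded > threshold)      # 5.5 True
```

With the guard in place, the remaining rules still add up to a usable total, and the warning names the rule that supplied the NaN.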
Comment 14 Mark Martinec 2008-10-13 08:34:53 UTC
Created attachment 4376
Deal with NaN in AutoWhitelist and PerMsgStatus

AutoWhitelist.pm: warn on NaN entering or coming from a database, and ignore it.

Includes my previous patch (4375: warn if a plugin tries to contribute a NaN score to the total).
Comment 15 Mark Martinec 2008-10-14 09:22:53 UTC
Created attachment 4377
Deal with NaN in AutoWhitelist and PerMsgStatus, fix a bad AWL record
Comment 16 Mark Martinec 2008-10-16 06:45:34 UTC
After Benedict also reported a seemingly unrelated source
of NaN floating-point values creeping into timing data:

  spamd[1321]: plugin: eval failed:
  Sort subroutine didn't return a numeric value
  at /usr/share/perl5/Mail/SpamAssassin/AsyncLoop.pm line 278.

and mentioned that he is running SpamAssassin on a UML virtual
machine (http://user-mode-linux.sourceforge.net/), a Google
search revealed a bug in the UML virtual machine:

  http://fixunix.com/openssl/
    518688-re-uml-devel-dev-random-problems-fp-registers-corruption.html

and a fix for it (February 2008):

  UML - Fix FP register corruption
  http://kerneltrap.org/mailarchive/linux-kernel/2008/2/12/829244
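
The "Sort subroutine didn't return a numeric value" failure quoted earlier in this comment is consistent with a NaN reaching a numeric comparison. perlop documents that Perl's `<=>` operator returns undef when an operand is NaN, so a comparator built on it has no number to hand back. Here is a small Python sketch of the underlying problem (the `three_way` helper is hypothetical and only mimics `<=>`; it is not code from AsyncLoop.pm).

```python
from typing import Optional

def three_way(a: float, b: float) -> Optional[int]:
    """Mimic a numeric three-way comparison; None plays the role of undef."""
    if a < b:
        return -1
    if a > b:
        return 1
    if a == b:
        return 0
    # Reached only when an operand is NaN: every comparison above is False.
    return None

nan = float("nan")
print(three_way(1.0, 2.0))  # -1
print(three_way(2.0, 2.0))  # 0
print(three_way(nan, 2.0))  # None
```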

Benedict wrote:
> That could well be it. I was running a 2.6.25.4 UML kernel.
> I don't know if that specific kernel version had that error but it seems so.
> I've compiled a new 2.6.27 UML kernel and rebooted the virtual machine with 
> the new kernel. Let's see what happens next :)

and two days later:

> At the moment, so far so good. I haven't seen any nan scores anymore and
> all the emails with a score > 4 are now flagged as spam like I want.

So it seems the mystery is explained, at least in Benedict's case.

Still, a NaN may well come from a third-party plugin, and as my proposed
patch was able to detect and sanitize a NaN source in Benedict's case,
I propose that the patch be included in 3.2.6 (it has already been in the
3.3 cvs for the last couple of days).

The patch
  "Deal with NaN in AutoWhitelist and PerMsgStatus, fix a bad AWL record"
does the following:
- prevents, and warns about, a plugin's attempt to add a NaN to the score
  (which would destroy what was collected so far and pollute the final result);
- prevents a NaN from entering an AWL database and polluting further SA runs;
- sanitizes a NaN in a fetched AWL record by resetting the record,
  when such a record happens to be in a database (e.g. from earlier bugs).
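
The two AWL-related points can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the patch itself: a record is modelled as a message count plus a running score total, which is an assumption about the storage layout, and the names `AwlRecord`, `sanitize` and `add_score` are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class AwlRecord:
    """Hypothetical stand-in for one auto-whitelist entry."""
    count: int = 0
    total: float = 0.0

def sanitize(record: AwlRecord) -> AwlRecord:
    """Reset a fetched record whose stored total is NaN."""
    if math.isnan(record.total):
        return AwlRecord()  # start over rather than propagate the NaN
    return record

def add_score(record: AwlRecord, score: float) -> AwlRecord:
    """Store a new score, refusing NaN so the database stays clean."""
    if math.isnan(score):
        return record  # ignore the bad value; a real implementation would warn
    return AwlRecord(record.count + 1, record.total + score)

bad = AwlRecord(count=3, total=float("nan"))  # record polluted by an earlier run
clean = sanitize(bad)                         # reset to an empty record
clean = add_score(clean, float("nan"))        # rejected, record unchanged
clean = add_score(clean, 6.0)                 # accepted
print(clean)  # AwlRecord(count=1, total=6.0)
```

Resetting the record discards its history for that sender, but a total that has become NaN carries no usable history anyway.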

Moving into a review state, 2 votes needed...
Comment 17 Justin Mason 2008-10-16 07:29:43 UTC
sure.  +1
Comment 18 Justin Mason 2009-12-17 13:38:01 UTC
is this fixed in 3.3.0?
Comment 19 Mark Martinec 2009-12-17 15:54:40 UTC
> is this fixed in 3.3.0?

Yes, there is a workaround in 3.3 (r704140, r703487 and thereabouts).
The bug was retargeted to 3.2.6 just in case, although the problem
apparently only happens on faulty (virtual) hardware and is not common.

I think we can just close it.