Bug 4650 - child timeout processing eventually causes spamd to hang
Status: RESOLVED WORKSFORME
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd
Version: 3.1.0
Hardware: HP Linux
Importance: P5 normal
Target Milestone: 3.3.0
Assigned To: SpamAssassin Developer Mailing List
Reported: 2005-10-28 16:12 UTC by James E.J. Bottomley
Modified: 2008-04-21 02:03 UTC

Description James E.J. Bottomley 2005-10-28 16:12:29 UTC
This problem is seen with the debian version: 3.1.0a-1

I run spamassassin on a very slow machine (HP B180), which means that operations
take much longer than on a modern ix86 system.  The symptom I see is that spamd
eventually fails to fork.  When I run ps, there are 5 spamd processes (matching
the --max-children 5 argument), all owned by my users, all idle, but
none returning to the spamd fork pool.  For each of them there's a message in
the log saying

spamd[26351]: bayes: expire_old_tokens: child processing timeout at /usr/sbin/spamd line 1088.

(However, there are more of these lines than there are children.)  It looks
as though there is some error path after the timeout in which the child fails to
return to the prefork pool.
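
One way to check the mismatch described above is to count the timeout messages in the log and compare the result with the number of live spamd processes. This is only a minimal sketch: the function name count_child_timeouts and the log path in the usage comment are assumptions, not something taken from this report.

```sh
#!/bin/sh
# Count "child processing timeout" messages logged by spamd in a log file.
# The log file is passed as an argument because its location varies by system.
count_child_timeouts() {
    grep -c 'spamd\[[0-9]*]: .*child processing timeout' "$1"
}

# Usage (the log path is an assumption; adjust for your syslog setup):
#   count_child_timeouts /var/log/mail.log
#   ps -C spamd -o pid,user,stat,etime,args
```

If the count is larger than the number of spamd processes shown by ps, some children timed out more than once or were replaced after timing out.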

On an unmodified spamd installation, this causes spamd to become unusable after
about a day of processing emails.

I've worked around this problem on my system by adding the option 
--timeout-child 3600 and now spamd has been running properly for the last week.
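
For reference, the workaround amounts to one extra flag on the spamd command line. A minimal sketch, assuming the daemon options are collected in a shell variable named OPTIONS, as in the defaults file that Debian's packaging typically reads (the variable name and that file are assumptions about the local setup; the flag values are the ones from this report):

```sh
# Options passed to spamd at startup.
#   --max-children 5      size of the child pool, as in this report
#   --timeout-child 3600  time out children after 3600 seconds
OPTIONS="--max-children 5 --timeout-child 3600"
```

This only makes the timeout much less likely to fire on a slow machine; it does not address why a timed-out child fails to return to the pool.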
Comment 1 Dallas Engelken 2005-10-28 17:08:39 UTC
see bug 3828 for my comments regarding child timeouts and bayes expiry.

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=3828#c38

and theo's response on c39

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=3828#c39

i guess the secondary helper call never made it into 3.1..

d
Comment 2 Dallas Engelken 2005-10-28 17:12:38 UTC
also, to avoid relying on auto-expiry by spamd, you can disable auto-expiry
and run sa-learn --force-expire (as the user you run spamd as) via cron on a
semi-regular basis.  this will mitigate the issue until a better solution is in
place for auto-expiry via spamd.
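
The setup described above could look roughly like this. bayes_auto_expire is the SpamAssassin configuration setting that controls opportunistic expiry; the function name, the SA_LEARN override variable, the script path and the cron schedule are all illustrative assumptions.

```sh
#!/bin/sh
# Assumes the SpamAssassin configuration (e.g. local.cf) contains:
#   bayes_auto_expire 0
#
# Hypothetical crontab entry for the user spamd runs as (nightly at 03:30),
# pointing at a script that defines and then calls expire_bayes:
#   30 3 * * * /usr/local/bin/sa-expire.sh

# SA_LEARN can be overridden for testing; it defaults to the real sa-learn.
SA_LEARN="${SA_LEARN:-sa-learn}"

# Force a Bayes token expiry run outside of spamd.
expire_bayes() {
    "$SA_LEARN" --force-expire
}
```

Running the expiry from cron keeps the long-running work out of the spamd children, so it cannot collide with the child timeout.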
Comment 3 Dallas Engelken 2005-10-28 17:22:19 UTC
okay.. i hate to keep writing follow-ups, but i keep thinking of things. ;)

have you converted to sql-driven bayes??  my token expiration runs usually take 
less than 10 seconds now that i have bayes in sql.  this may be a more elegant 
solution to the problem. 

# time sa-learn --force-expire

real    0m3.894s
user    0m1.220s
sys     0m0.060s


Comment 4 James E.J. Bottomley 2005-10-28 17:47:36 UTC
Actually, I'm perfectly happy currently with the solution I'm employing.  The
reason for reporting isn't the time it takes (I have CPU time in abundance on
the network gateway); it's the bug where the child processes don't return to the
prefork pool, causing spamd to hang eventually.  That one needs looking at before
it trips someone else up.
Comment 5 Justin Mason 2006-12-11 04:06:24 UTC
if you can ever capture -D debug logs, or strace logs, for this situation, that
would be very helpful.
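
A rough sketch of how such a trace might be captured from an idle child, assuming strace is installed and the PID comes from ps or from the log line. The function name, the STRACE override variable and the output path are assumptions for illustration.

```sh
#!/bin/sh
# STRACE can be overridden for testing; it defaults to the real strace.
STRACE="${STRACE:-strace}"

# Attach to a stuck spamd child and write its system calls to a file.
#   $1 = PID of the idle child, $2 = output file
# -f follows forked processes, -tt adds timestamps, -o writes to a file.
trace_spamd_child() {
    "$STRACE" -f -tt -o "$2" -p "$1"
}

# Usage (PID taken from the log line in this report; path is an assumption):
#   trace_spamd_child 26351 /tmp/spamd-26351.strace
# For debug logs, spamd itself can be started with -D.
```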
Comment 6 Justin Mason 2006-12-12 12:40:18 UTC
moving RFEs and low-priority stuff to 3.3.0 target
Comment 7 Justin Mason 2008-04-21 02:03:36 UTC
this is a very old bug, and hasn't been touched in several years; it's probably not an issue with current versions.  If this is not the case, feel free to reopen.