SA Bugzilla – Bug 4138
spamd hanging on infinite loop
Last modified: 2011-10-05 19:42:25 UTC
I'm using SA 3.0.2 on a CentOS 3.4 Linux system in a spamd/spamc setup. spamc is called via 'maildrop' with user privileges. The problem is that a message in my (postfix) queue seems to hang spamd. Every time local delivery is attempted, spamc times out after 10 minutes, but the spamd child keeps using the CPU forever without making any syscalls. The last thing spamd reports in debug mode is that it is doing 'tokenize:' steps. It's not specific to one message, and other messages for the same user succeed. The problem is related to the bayes* files in the user's $HOME/.spamassassin/ directory. Erasing them fixes the problem. Setting the old files back reintroduces the problem. A sa-learn --backup and a --restore also seems to fix the problem. So the bug is no showstopper, but I think even feeding random bits to my Bayes database shouldn't make spamd go wild. BTW: if one sends the hanging spamd child a TERM, it terminates cleanly and the parent spawns a new child. I can reproduce the problem with the bayes files and a mail file, which I will attach to this ticket.
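The backup/restore workaround described in this report could be sketched as a small shell function. This is only a sketch under stated assumptions: sa-learn is on PATH and works on the default per-user database under ~/.spamassassin, and the names rebuild_bayes, suspect/ and sa-learn-dump.txt are hypothetical. It copies the suspect bayes_* files aside first, so they stay available for reproducing the hang.

```shell
#!/bin/sh
# Sketch: rebuild the per-user Bayes database through a text dump.
# Assumptions (not from the report): sa-learn is on PATH and uses the
# default database under ~/.spamassassin; "suspect" and
# "sa-learn-dump.txt" are illustrative names.
rebuild_bayes() {
    dir="$HOME/.spamassassin"
    dump="$dir/sa-learn-dump.txt"

    # Keep copies of the suspect files so the hang can still be reproduced.
    mkdir -p "$dir/suspect" || return 1
    cp -p "$dir"/bayes_* "$dir/suspect/" || return 1

    # Dump the database to text, then rebuild it from that dump.
    # --restore replaces the existing Bayes data.
    sa-learn --backup > "$dump" || return 1
    sa-learn --restore "$dump"
}
```

It would be run as the affected user, simply as `rebuild_bayes`.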
Is it trying to do an expire (are there *.expire* files in the .spamassassin dir)?
Created attachment 2651 [details] Email message triggering the problem
Created attachment 2652 [details] bayes_seen file
Can't attach bayes_toks: "The file you are trying to attach is 1232 kilobytes (KB) in size. Non-patch attachments cannot be more than 1000 KB."
Also, possibly you've got a corrupted bayes db file. Try compressing it with gzip or bzip2 so that it fits under the attachment limit. If it's a corrupted bayes file, we will need it to debug.
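A minimal sketch of that compression step, assuming gzip is installed. The function name compress_for_attachment is hypothetical, and the 1000 KB default simply mirrors the limit quoted in the Bugzilla error message. gzip -c is used so that the original file is left untouched for further debugging.

```shell
#!/bin/sh
# Sketch: compress a file for attachment and check it against a size limit.
# The 1000 KB default mirrors the limit quoted in the error message;
# the function name is illustrative.
compress_for_attachment() {
    src="$1"
    limit_kb="${2:-1000}"

    # -c writes to stdout, so the original file is left untouched.
    gzip -9 -c "$src" > "$src.gz" || return 1

    # Round the compressed size up to whole kilobytes.
    bytes=$(wc -c < "$src.gz")
    size_kb=$(( (bytes + 1023) / 1024 ))
    echo "$src.gz: $size_kb KB (limit $limit_kb KB)"

    # Succeed only if the compressed file fits under the limit.
    [ "$size_kb" -le "$limit_kb" ]
}
```

For example: `compress_for_attachment ~/.spamassassin/bayes_toks`.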
(In reply to comment #1)
> Is it trying to do an expire (are there *.expire* files in the .spamassassin dir)?
No, no such files around.
Created attachment 2653 [details] bayes_toks gzipped file
Attached message causes no problem on my stock 3.0.2 install, with my own bayes db. Does it slow down when using spamassassin or only spamd/spamc?
(In reply to comment #8)
> Attached message causes no problem on my stock 3.0.2 install, with my
> own bayes db.
That's right. You need both the message and the db files to reproduce the problem.
> Does it slow down when using spamassassin or only spamd/spamc?
I didn't try spamassassin, only spamd/spamc.
> I didn't try spamassassin, only spamd/spamc.
Can you please capture some debug output (via -D), with the broken bayes dbs in place, for both spamd and spamassassin?
Created attachment 2654 [details] debug output spamd -D -d -c -m5 -H
Also, output from spamassassin -D --lint Thanks Michael
Created attachment 2655 [details] output spamassassin -D --lint (run as the affected user)
I've started a little poor man's debugging myself: it gets stuck writing to the bayes_seen hash. From learn_trapped, it succeeds in tokenizing, stores the tokens, calls $self->{store}->seen_put and seen_put_direct, and then hangs on the line:
$self->{db_seen}->{$msgid} = $seen;
called with the values b9eccae48c5e4921fcffbe305d11195f1953e45a@sa_generated, h
Cheers,
This really looks like a corrupt bayes_seen db file. Did you already rm the old files and start from scratch?
Michael, "This really looks like a corrupt bayes_seen db file. Did you already rm the old files and start from scratch?" Per the original report, "Erasing them fixes the problem. Setting the old files back reintroduces the problem. A sa-learn --backup and a --restore also seems to fix the problem."
Henk, after you've deleted the Bayes files, does the problem come back, or is it gone for good? If it comes back and Bayes corruption is the cause, then something is /causing/ the corruption, and that should be looked into, if possible. Even so, why would /this/ message aggravate the corruption and cause a spamd/spamc hang, but not most others?
Henk, you included the debug output from spamd, which ends with the header tokenize. I don't know spamd well enough to know whether there should then be body tokenize lines. It might be that tokenize is failing on the body. Perhaps on those long lines of encoded equal signs?
You attached a "spamassassin -D --lint", but not a "spamassassin -D <message" output. Can you do the latter, so we can compare the spamd and spamassassin debug outputs as each actually processes the same message against the same problematic bayes files?
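A minimal sketch of how the "spamassassin -D <message" output could be captured even if the scan never returns. It assumes GNU coreutils timeout is available, which is an assumption and not something the report mentions. The function name capture_debug and the 120-second default are hypothetical.

```shell
#!/bin/sh
# Sketch: capture debug output from a scan that may never return.
# Assumptions: GNU coreutils `timeout` is installed; the 120-second
# default and the function name are illustrative.
capture_debug() {
    msg="$1"
    log="$2"
    limit="${3:-120}"

    # spamassassin -D writes debug lines to stderr; the processed message
    # on stdout is not needed here, so it is discarded.
    timeout "$limit" spamassassin -D < "$msg" > /dev/null 2> "$log"
    status=$?

    # GNU timeout exits with 124 when it had to stop the command.
    if [ "$status" -eq 124 ]; then
        echo "no result after ${limit}s; last debug lines:" >&2
        tail -n 5 "$log" >&2
    fi
    return "$status"
}
```

For example, `capture_debug message.txt sa-debug.log 600` would allow the same 10 minutes that spamc waits before giving up. The last lines of the log should then show how far tokenizing got before the hang.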
Closing WORKSFORME until we hear more. Typically, hangs when writing a value to a Berkeley DB file are indicative of (a) corrupt db files, or (b) bugs in that library, which we cannot fix.
Supposing this is the same bug as the Debian bug 295451, it may be a matter of passing O_EXCL to DB_File. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=295451
Closing WORKSFORME, again. Hasn't been updated in 5 years. Target is 3.1.1.