SA Bugzilla – Bug 3097
Make spamd and (possibly) mass-check use Net::Server
Last modified: 2005-01-21 23:52:34 UTC
spamd currently does a "fork per incoming message" which is rather inefficient. I'd rather us do a "Prefork" method where we fork off -m children (or do a min/max thing) and pass out incoming messages to free children. Instead of reinventing the wheel, Net::Server already can do this, and would likely not be that hard to integrate into the code. Since we'd be calling Net::Server and not copying in their code, there's no license issue. This was discussed a little bit already on the spamassassin-dev list, fyi.
been chatting with folks ... Some have commented that Net::Server may be overkill, and also requires people install Net::Server so it's another module dependency. http://www.ijs.si/software/amavisd/#faq-net-server documents some issues with Net::Server as well. I was pointed at http://search.cpan.org/dist/POE/ ... That's another external dependency though. I figure there'll be some effort required to either code up a pre-fork method or to modify our code to use other modules that are out there. I'm not thrilled with yet-another external dependency though. I use spampd at work, which uses Net::Server, which is why I knew about it, and it has the Prefork/ PreforkSimple code that I was thinking about doing... Any thoughts?
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Sat, Feb 28, 2004 at 05:49:43PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote: > > Any thoughts? > +1 for using Net::Server I've also used Net::Daemon in the past with success. Michael
no matter which way we go, I was thinking today that there'll have to be new code added in as well. specifically code to deal with switching user back correctly between calls, and general cleanup. right now the fork-per-message model deals with most of that by making it all moot. :|
Just a comment from a passer-by (found this by grepping the web for "spamd prefork", perl's compile time is killing us). I used Net::Daemon some time ago on another project and was not happy with the memory footprint. That was an earlier version, but I'd recommend writing a trivial Net::Daemon application and checking memory consumption against the equivalent written without the module. No experience w/ Net::Server. With persistent processes it might be possible to compile regexes once, no? This would be another big win.
'With persistent processes it might be possible to compile regexes once, no? This would be another big win.' spamd already does this.
I started working on this the other day. I came to the conclusion that we should do a homebrew method to get this done. Net::Server would be good, but has a few sticky issues, a main one being that it doesn't natively support SSL connections. (there's an expirimental set of code available to do this, but I'd rather not base our code on that right now...)
ok, I committed a new spamd this evening. r10000 (yea!)
Subject: Re: Make spamd and (possibly) mass-check use Net::Server -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >ok, I committed a new spamd this evening. r10000 (yea!) You got it -- damn! I was planning to check in some "insert blank line" fake-commit to grab that one ;) - --j. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Exmh CVS iD8DBQFAfMd+QTcbUG5Y7woRAozQAKCU0nTB3TL5kVP+5sjLPCL8akRBnACg1STk tqHUnwZFtBhNkEFV+3Yb9cQ= =JA99 -----END PGP SIGNATURE-----
Subject: Re: Make spamd and (possibly) mass-check use Net::Server I hate to be the wet blanket here, but does the pre-forking actually help performance?
I'm curious about this too given that a server will tend to take a steady fork/sec rate regardless of forking for each connection or maintaining a available pool. Reuse of a child in a pool would be something altogether different but would require the removal of some of the existing features. Perhaps spamd's model could depend on it's configuration options? If a admin has configured spamd to setuid we prefork and the child goes away, but if spamd only runs as one user, the child could return to the pool to be reused? I have no idea if the factory, once used, can be returned to it's pristine state or what other issues could prevent this. (That is, if supporting it isn't enough on it's own.)
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Wed, Apr 14, 2004 at 01:09:47AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > I'm curious about this too given that a server will tend to take a steady > fork/sec rate regardless of forking for each connection or maintaining a > available pool. Reuse of a child in a pool would be something altogether Yes, but the prefork method will have a much lower fork/sec rate than the fork-per-message model. For example: using the defaults of 5 children and 1000 messages per child before exiting... assume 5000 messages go through: fork-per-message will do 5000 fork() calls, once for each message. prefork will do approx. 10 fork() calls, 5 at the start, and approx. 5 at the end (depends if each child got 1000 connections evenly distributed...) > different but would require the removal of some of the existing features. Really? How so? The code I put in reuses the children all the time, and I didn't get rid of anything spamd already did.
>Really? How so? The code I put in reuses the children all the time, >and I didn't get rid of anything spamd already did. I should probably go read the code but doesn't that imply that if spamd is running as root with setuid enabled that the child is maintaining a real uid of root?
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Wed, Apr 14, 2004 at 11:11:28AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > I should probably go read the code but doesn't that imply that if spamd is > running as root with setuid enabled that the child is maintaining a real uid of > root? Yes. All of the children are running with the same RUID. However, when a message comes in, as before, the child will change EUID as appropriate, and do the processing. In the new code, however, after processing the EUID will change back to the RUID (as appropriate), as well as having the now potentially modified config reverted to the original version.
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Tue, Apr 13, 2004 at 10:46:37PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > I hate to be the wet blanket here, but does the pre-forking actually > help performance? It depends on your environment. A few things it does off the top of my head: 1) prefork in general is better than fork-per-message wrt performance. I haven't done any testing to show numbers, but you can think about it easily: for 5000 messages, the old spamd has to fork() 5000 times, versus the new version which (with default values) has to only fork 6-10 times. the flip side is that the prefork version needs to backup/restore the configuration per message, and revert EUID, etc. 2) we can do things like persistent connections to the SQL server from spamd which will be a big improvement if you use SQL (see related ticket about that) 3) -m will work more correctly -- a pool of X processes are spawned and maintained.
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Tue, Apr 13, 2004 at 10:46:37PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > ------- Additional Comments From quinlan@pathname.com 2004-04-13 22:46 ------- > Subject: Re: Make spamd and (possibly) mass-check use Net::Server > > I hate to be the wet blanket here, but does the pre-forking actually > help performance? > Yes, I saw a 10-15% speed up in my bayes DBM benchmarks. I haven't tried a SQL benchmark but it should be close to the same. We should in theory get an even greater speedup once we start caching database handles between messages. Michael
Subject: Re: Make spamd and (possibly) mass-check use Net::Server -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >> different but would require the removal of some of the existing features. > >Really? How so? The code I put in reuses the children all the time, >and I didn't get rid of anything spamd already did. Well, Kelsey has a good point -- spamd uses setuid() when run as root. (However changing it to use seteuid() when changing root->user, then calling seteuid() to change user->root after the request is processed, may work around this) - --j. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Exmh CVS iD8DBQFAfXkcQTcbUG5Y7woRAtsdAJ49yG2f8qyfZ70TFBiIB0Ck0kjElgCfcONU BKE5HMLASo0vhVdoEmfC500= =PQB5 -----END PGP SIGNATURE-----
>Yes. All of the children are running with the same RUID. However, when While I'm personally not that bothered by this much, maintaining a real UID of root through a child process isn't something to do lightly. If there is ever an exploit found in a child and the RUID is root, the box is 0wn3d. If the RUID=EUID and EUID!=0 than the attacker has user level access to the server which may lead to other problems if the box is locally exploitable. I'd assumed that the project would want to maintain this level of paranoia which is why I said that some features (like setuid) would have to be pulled to do a preforker/reuse server (securely.) I don't want to rock the boat or get presumptuous on how big my tiny role with the project is but I'd personally like to see a concensus from the cabal at least that this is the way to go.
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Wed, Apr 14, 2004 at 11:54:41AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > I'd assumed that the project would want to maintain this level of paranoia which > is why I said that some features (like setuid) would have to be pulled to do a > preforker/reuse server (securely.) I don't want to rock the boat or get > presumptuous on how big my tiny role with the project is but I'd personally like > to see a concensus from the cabal at least that this is the way to go. I'd just like to mention that this specific behavior is the _exact same_ as spamd has always done. There's absoluetely no change in how this works in my new code. old code: daemon is root, waits for connection. when connection is accepted, fork() a process to deal with said connection. if check() is called, change euid appropriately, do the check. child returns result, child dies. new code: daemon is root, spawns children, also running as root. 1: children wait for connection. random child accepts connection, deals with connection. if check() is called, change euid appropriately, do the check. child returns result. changes euid back to root, resets configuration to pre-connection version. child closes connection. if less than 1000 connections have been processed, loop to 1. else, child dies and parent spawns a new child, loop to 1. if we want to make it more secure, that's great, but this isn't an issue due to the new code, so is OT for this ticket IMHO.
(Good to hear that Theo's already done the RUID=0 EUID=user thing btw.) But Kelsey makes a good point. 'While I'm personally not that bothered by this much, maintaining a real UID of root through a child process isn't something to do lightly. If there is ever an exploit found in a child and the RUID is root, the box is 0wn3d.' Yes -- good point. we do fork extensively in the scanner, e.g. to run dcc, pyzor etc. We will need to figure this out. Traditionally, the approach is to identify the points where the forking will take place, and rewrite so that after the fork, a RUID=EUID takes place (typically using a helper function that tests if RUID==0 and EUID!=0 and setuids if so). the resulting RUID/EUID map would be: in master process ($$=master): RUID=0 EUID=0 in preforked process ($$=prefork): RUID=0 EUID=0 get username, setuid ($$=prefork): RUID=0 EUID=user scanner ($$=prefork): RUID=0 EUID=user fork to run pyzor ($$=pyzorhelper): RUID=user EUID=user return results ($$=prefork): RUID=0 EUID=user after scanner finishes ($$=prefork): RUID=0 EUID=0 Note that historically privilege-escalation attacks in setuid apps have revolved around finding places where RUID=0, EUID=user, and a fork is done, and replacing what gets forked (or replacing a key env var like PATH, LD_LIBRARY_PATH, LD_PRELOAD or IFS). Note also that on some platforms, some system calls (like getcwd()) will cause a fork. :( 'If the RUID=EUID and EUID!=0 than the attacker has user level access to the server which may lead to other problems if the box is locally exploitable.' That was the previous situation, and is preferable to the above. I think we want to make sure we keep to this situation.
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Wed, Apr 14, 2004 at 11:45:01AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > Well, Kelsey has a good point -- spamd uses setuid() when run > as root. No it doesn't. There's exactly 1 setuid() call (aka: $< = #), and that's in the initial pre-spawn area if -u is set. Still the same code. Everything else is done via seteuid() (aka: $> = #). > (However changing it to use seteuid() when > changing root->user, then calling seteuid() to change user->root > after the request is processed, may work around this) That's what I did.
Subject: Re: Make spamd and (possibly) mass-check use Net::Server On Wed, Apr 14, 2004 at 12:20:25PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > http://bugzilla.spamassassin.org/show_bug.cgi?id=3097 > > > > > > ------- Additional Comments From felicity@kluge.net 2004-04-14 12:20 ------- > Subject: Re: Make spamd and (possibly) mass-check use Net::Server > > On Wed, Apr 14, 2004 at 11:45:01AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote: > > Well, Kelsey has a good point -- spamd uses setuid() when run > > as root. > > No it doesn't. There's exactly 1 setuid() call (aka: $< = #), and that's > in the initial pre-spawn area if -u is set. Still the same code. > > Everything else is done via seteuid() (aka: $> = #). You are absolutely correct. I even checked all the way back to 1.5 to see if it had been dropped in a rewrite but it was never there. Now, otoh, it doesn't mean spamd shouldn't have been setting the EUID and RUID all along. Perhaps we can put a warning in the documentation that states the dangers inherent in a child maintaining a RUID of root and suggest that if people have issue with this that they run spamd as an unpriveledged user with SQL or virtualuser configs and leave it at that?
Subject: Re: Make spamd and (possibly) mass-check use Net::Server -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > I'd just like to mention that this specific behavior is the _exact same_ > as spamd has always done. There's absoluetely no change in how this > works in my new code. > > old code: > daemon is root, waits for connection. when connection is accepted, fork() > a process to deal with said connection. if check() is called, change > euid appropriately, do the check. child returns result, child dies. Actually, I was pretty sure the old code set *both* euid and ruid to user. (checks) you're right. hmmm... OK, we should fix that ;) - --j. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) Comment: Exmh CVS iD8DBQFAfZmdQTcbUG5Y7woRAtv4AKCKvcrEmoid3NA1MDnBmtc1lBLSGwCfUc12 3h68Lr2RCVBh18Ol6m0v4NA= =qcwf -----END PGP SIGNATURE-----
Created attachment 2618 [details] straced spamd straced spamd process during a hang.
wtf bugzilla... ignore that attachment :)