Bug 3097 - Make spamd and (possibly) mass-check use Net::Server
Summary: Make spamd and (possibly) mass-check use Net::Server
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other other
: P3 enhancement
Target Milestone: 3.0.0
Assignee: Theo Van Dinter
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 2037 2171 3208
  Show dependency tree
 
Reported: 2004-02-26 23:04 UTC by Theo Van Dinter
Modified: 2005-01-21 23:52 UTC (History)
1 user (show)



Attachment Type Modified Status Actions Submitter/CLA Status
straced spamd application/x-gzip None Dallas Engelken [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Theo Van Dinter 2004-02-26 23:04:50 UTC
spamd currently does a "fork per incoming message" which is rather inefficient.  I'd rather us do a 
"Prefork" method where we fork off -m children (or do a min/max thing) and pass out incoming 
messages to free children.

Instead of reinventing the wheel, Net::Server already can do this, and would likely not be that hard to 
integrate into the code.  Since we'd be calling Net::Server and not copying in their code, there's no 
license issue.

This was discussed a little bit already on the spamassassin-dev list, fyi.
Comment 1 Theo Van Dinter 2004-02-28 17:49:42 UTC
been chatting with folks ...  Some have commented that Net::Server may be overkill, and also requires 
people install Net::Server so it's another module dependency.

http://www.ijs.si/software/amavisd/#faq-net-server documents some issues with Net::Server as well.

I was pointed at http://search.cpan.org/dist/POE/ ...  That's another external dependency though.


I figure there'll be some effort required to either code up a pre-fork method or to modify our code to 
use other modules that are out there.  I'm not thrilled with yet-another external dependency though.  I 
use spampd at work, which uses Net::Server, which is why I knew about it, and it has the Prefork/
PreforkSimple code that I was thinking about doing...

Any thoughts?
Comment 2 Michael Parker 2004-02-28 18:01:27 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Sat, Feb 28, 2004 at 05:49:43PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> 
> Any thoughts?
> 

+1 for using Net::Server

I've also used Net::Daemon in the past with success.

Michael

Comment 3 Theo Van Dinter 2004-03-06 22:15:46 UTC
no matter which way we go, I was thinking today that there'll have to be new 
code added in as well.  specifically code to deal with switching user back 
correctly between calls, and general cleanup. right now the fork-per-message 
model deals with most of that by making it all moot. :|
Comment 4 William White 2004-03-29 10:26:55 UTC
Just a comment from a passer-by (found this by grepping the web for "spamd
prefork", perl's compile time is killing us).

I used Net::Daemon some time ago on another project and was not happy with the
memory footprint.  That was an earlier version, but I'd recommend writing a
trivial Net::Daemon application and checking memory consumption against the
equivalent written without the module.  No experience w/ Net::Server.

With persistent processes it might be possible to compile regexes once, no? 
This would be another big win.
Comment 5 Justin Mason 2004-03-29 11:41:27 UTC
'With persistent processes it might be possible to compile regexes once, no? 
This would be another big win.'

spamd already does this.
Comment 6 Theo Van Dinter 2004-04-10 10:27:13 UTC
I started working on this the other day.  I came to the conclusion that we 
should do a homebrew method to get this done.  Net::Server would be good, but 
has a few sticky issues, a main one being that it doesn't natively support SSL 
connections.  (there's an expirimental set of code available to do this, but I'd 
rather not base our code on that right now...)
Comment 7 Theo Van Dinter 2004-04-13 20:25:00 UTC
ok, I committed a new spamd this evening.   r10000  (yea!)
Comment 8 Justin Mason 2004-04-13 22:09:27 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


>ok, I committed a new spamd this evening.   r10000  (yea!)

You got it -- damn!   I was planning to check in some "insert
blank line" fake-commit to grab that one ;)

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFAfMd+QTcbUG5Y7woRAozQAKCU0nTB3TL5kVP+5sjLPCL8akRBnACg1STk
tqHUnwZFtBhNkEFV+3Yb9cQ=
=JA99
-----END PGP SIGNATURE-----

Comment 9 Daniel Quinlan 2004-04-13 22:46:36 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

I hate to be the wet blanket here, but does the pre-forking actually
help performance?

Comment 10 Kelsey Cummings 2004-04-14 01:09:46 UTC
I'm curious about this too given that a server will tend to take a steady
fork/sec rate regardless of forking for each connection or maintaining a
available pool. Reuse of a child in a pool would be something altogether
different but would require the removal of some of the existing features. 
Perhaps spamd's model could depend on it's configuration options?  If a admin
has configured spamd to setuid we prefork and the child goes away, but if spamd
only runs as one user, the child could return to the pool to be reused?  I have
no idea if the factory, once used, can be returned to it's pristine state or
what other issues could prevent this.  (That is, if supporting it isn't enough
on it's own.)
Comment 11 Theo Van Dinter 2004-04-14 08:04:13 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Wed, Apr 14, 2004 at 01:09:47AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I'm curious about this too given that a server will tend to take a steady
> fork/sec rate regardless of forking for each connection or maintaining a
> available pool. Reuse of a child in a pool would be something altogether

Yes, but the prefork method will have a much lower fork/sec rate than
the fork-per-message model.  For example:

using the defaults of 5 children and 1000 messages per child before
exiting...  assume 5000 messages go through:

fork-per-message will do 5000 fork() calls, once for each message.

prefork will do approx. 10 fork() calls, 5 at the start, and approx. 5 at
the end (depends if each child got 1000 connections evenly distributed...)

> different but would require the removal of some of the existing features. 

Really?  How so?  The code I put in reuses the children all the time,
and I didn't get rid of anything spamd already did.

Comment 12 Kelsey Cummings 2004-04-14 11:11:27 UTC
>Really?  How so?  The code I put in reuses the children all the time,
>and I didn't get rid of anything spamd already did.

I should probably go read the code but doesn't that imply that if spamd is
running as root with  setuid enabled that the child is maintaining a real uid of
root?
Comment 13 Theo Van Dinter 2004-04-14 11:21:07 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Wed, Apr 14, 2004 at 11:11:28AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I should probably go read the code but doesn't that imply that if spamd is
> running as root with  setuid enabled that the child is maintaining a real uid of
> root?

Yes.  All of the children are running with the same RUID.  However, when
a message comes in, as before, the child will change EUID as appropriate,
and do the processing.  In the new code, however, after processing the
EUID will change back to the RUID (as appropriate), as well as having
the now potentially modified config reverted to the original version.

Comment 14 Theo Van Dinter 2004-04-14 11:29:27 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Tue, Apr 13, 2004 at 10:46:37PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I hate to be the wet blanket here, but does the pre-forking actually
> help performance?

It depends on your environment.  A few things it does off the top of my head:

1) prefork in general is better than fork-per-message wrt performance.
   I haven't done any testing to show numbers, but you can think about
   it easily: for 5000 messages, the old spamd has to fork() 5000 times,
   versus the new version which (with default values) has to only fork
   6-10 times.  the flip side is that the prefork version needs to
   backup/restore the configuration per message, and revert EUID, etc.

2) we can do things like persistent connections to the SQL server from
   spamd which will be a big improvement if you use SQL (see related
   ticket about that)

3) -m will work more correctly -- a pool of X processes are spawned
   and maintained.

Comment 15 Michael Parker 2004-04-14 11:32:32 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Tue, Apr 13, 2004 at 10:46:37PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> ------- Additional Comments From quinlan@pathname.com  2004-04-13 22:46 -------
> Subject: Re:  Make spamd and (possibly) mass-check use Net::Server
> 
> I hate to be the wet blanket here, but does the pre-forking actually
> help performance?
> 

Yes, I saw a 10-15% speed up in my bayes DBM benchmarks.  I haven't
tried a SQL benchmark but it should be close to the same.  We should
in theory get an even greater speedup once we start caching database
handles between messages.

Michael

Comment 16 Justin Mason 2004-04-14 11:45:01 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


>> different but would require the removal of some of the existing features. 
>
>Really?  How so?  The code I put in reuses the children all the time,
>and I didn't get rid of anything spamd already did.

Well, Kelsey has a good point -- spamd uses setuid() when run
as root.   (However changing it to use seteuid() when
changing root->user, then calling seteuid() to change user->root
after the request is processed, may work around this)

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFAfXkcQTcbUG5Y7woRAtsdAJ49yG2f8qyfZ70TFBiIB0Ck0kjElgCfcONU
BKE5HMLASo0vhVdoEmfC500=
=PQB5
-----END PGP SIGNATURE-----

Comment 17 Kelsey Cummings 2004-04-14 11:54:40 UTC
>Yes.  All of the children are running with the same RUID.  However, when

While I'm personally not that bothered by this much,  maintaining a real UID of
root through a child process isn't something to do lightly.   If there is ever
an exploit found in a child and the RUID is root, the box is 0wn3d.  If the
RUID=EUID and EUID!=0 than the attacker has user level access to the server
which may lead to other problems if the box is locally exploitable.

I'd assumed that the project would want to maintain this level of paranoia which
is why I said that some features (like setuid) would have to be pulled to do a
preforker/reuse server (securely.)  I don't want to rock the boat or get
presumptuous on how big my tiny role with the project is but I'd personally like
to see a concensus from the cabal at least that this is the way to go.
Comment 18 Theo Van Dinter 2004-04-14 12:17:45 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Wed, Apr 14, 2004 at 11:54:41AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I'd assumed that the project would want to maintain this level of paranoia which
> is why I said that some features (like setuid) would have to be pulled to do a
> preforker/reuse server (securely.)  I don't want to rock the boat or get
> presumptuous on how big my tiny role with the project is but I'd personally like
> to see a concensus from the cabal at least that this is the way to go.

I'd just like to mention that this specific behavior is the _exact same_
as spamd has always done.  There's absoluetely no change in how this
works in my new code.

old code:
daemon is root, waits for connection.  when connection is accepted, fork()
a process to deal with said connection.  if check() is called, change
euid appropriately, do the check.  child returns result, child dies.

new code:
daemon is root, spawns children, also running as root.  1: children
wait for connection.  random child accepts connection, deals with
connection.  if check() is called, change euid appropriately, do
the check.  child returns result.  changes euid back to root, resets
configuration to pre-connection version.  child closes connection.
if less than 1000 connections have been processed, loop to 1.  else,
child dies and parent spawns a new child, loop to 1.


if we want to make it more secure, that's great, but this isn't an issue
due to the new code, so is OT for this ticket IMHO.

Comment 19 Justin Mason 2004-04-14 12:17:54 UTC
(Good to hear that Theo's already done the RUID=0 EUID=user thing btw.)
But Kelsey makes a good point.

'While I'm personally not that bothered by this much,  maintaining a real UID of
root through a child process isn't something to do lightly.   If there is ever
an exploit found in a child and the RUID is root, the box is 0wn3d.'

Yes -- good point.  we do fork extensively in the scanner, e.g. to run dcc,
pyzor etc.   We will need to figure this out.

Traditionally, the approach is to identify the points where the forking will
take place, and rewrite so that after the fork, a RUID=EUID takes place
(typically using a helper function that tests if RUID==0 and EUID!=0 and setuids
if so).

the resulting RUID/EUID map would be:

in master process ($$=master):       RUID=0 EUID=0
in preforked process ($$=prefork):   RUID=0 EUID=0
get username, setuid ($$=prefork):   RUID=0 EUID=user
scanner ($$=prefork):                RUID=0 EUID=user
fork to run pyzor ($$=pyzorhelper):  RUID=user EUID=user
return results ($$=prefork):         RUID=0 EUID=user
after scanner finishes ($$=prefork): RUID=0 EUID=0

Note that historically privilege-escalation attacks in setuid apps have revolved
around finding places where RUID=0, EUID=user, and a fork is done, and replacing
what gets forked (or replacing a key env var like PATH, LD_LIBRARY_PATH,
LD_PRELOAD or IFS).

Note also that on some platforms, some system calls (like getcwd()) will cause a
fork. :(

'If the
RUID=EUID and EUID!=0 than the attacker has user level access to the server
which may lead to other problems if the box is locally exploitable.'

That was the previous situation, and is preferable to the above.  I think we
want to make sure we keep to this situation.

Comment 20 Theo Van Dinter 2004-04-14 12:20:24 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Wed, Apr 14, 2004 at 11:45:01AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Well, Kelsey has a good point -- spamd uses setuid() when run
> as root.

No it doesn't.  There's exactly 1 setuid() call (aka: $< = #), and that's
in the initial pre-spawn area if -u is set.  Still the same code.

Everything else is done via seteuid() (aka: $> = #).

> (However changing it to use seteuid() when
> changing root->user, then calling seteuid() to change user->root
> after the request is processed, may work around this)

That's what I did.

Comment 21 Kelsey Cummings 2004-04-14 13:02:59 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server

On Wed, Apr 14, 2004 at 12:20:25PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> http://bugzilla.spamassassin.org/show_bug.cgi?id=3097
> 
> 
> 
> 
> 
> ------- Additional Comments From felicity@kluge.net  2004-04-14 12:20 -------
> Subject: Re:  Make spamd and (possibly) mass-check use Net::Server
> 
> On Wed, Apr 14, 2004 at 11:45:01AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> > Well, Kelsey has a good point -- spamd uses setuid() when run
> > as root.
> 
> No it doesn't.  There's exactly 1 setuid() call (aka: $< = #), and that's
> in the initial pre-spawn area if -u is set.  Still the same code.
> 
> Everything else is done via seteuid() (aka: $> = #).

You are absolutely correct.  I even checked all the way back to 1.5 to see
if it had been dropped in a rewrite but it was never there.  Now, otoh, it
doesn't mean spamd shouldn't have been setting the EUID and RUID all along.

Perhaps we can put a warning in the documentation that states the dangers
inherent in a child maintaining a RUID of root and suggest that if people
have issue with this that they run spamd as an unpriveledged user with
SQL or virtualuser configs and leave it at that?

Comment 22 Justin Mason 2004-04-14 13:05:57 UTC
Subject: Re:  Make spamd and (possibly) mass-check use Net::Server 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> I'd just like to mention that this specific behavior is the _exact same_
> as spamd has always done.  There's absoluetely no change in how this
> works in my new code.
> 
> old code:
> daemon is root, waits for connection.  when connection is accepted, fork()
> a process to deal with said connection.  if check() is called, change
> euid appropriately, do the check.  child returns result, child dies.

Actually, I was pretty sure the old code set *both* euid and ruid to
user.  (checks) you're right. hmmm...  OK, we should fix that ;)

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFAfZmdQTcbUG5Y7woRAtv4AKCKvcrEmoid3NA1MDnBmtc1lBLSGwCfUc12
3h68Lr2RCVBh18Ol6m0v4NA=
=qcwf
-----END PGP SIGNATURE-----

Comment 23 Dallas Engelken 2005-01-22 08:49:37 UTC
Created attachment 2618 [details]
straced spamd

straced spamd process during a hang.
Comment 24 Dallas Engelken 2005-01-22 08:52:34 UTC
wtf bugzilla... ignore that attachment :)