Bug 5518 - spamd gets error: setrgid() not implemented when run as root with -u nobody on MacOS
Summary: spamd gets error: setrgid() not implemented when run as root with -u nobody o...
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: spamc/spamd (show other bugs)
Version: SVN Trunk (Latest Devel Version)
Hardware: Other Mac OS X
: P1 blocker
Target Milestone: 3.1.10
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on: 5510
Blocks: 5574
  Show dependency tree
 
Reported: 2007-06-15 04:34 UTC by Sidney Markowitz
Modified: 2007-08-10 22:36 UTC (History)
0 users



Attachment Type Modified Status Actions Submitter/CLA Status
Simple patch implementing Justin's suggested pseudocode patch None Sidney Markowitz [HasCLA]
Patch that uses POSIX::setgid and POSIX::setuid for portability patch None Sidney Markowitz [HasCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description Sidney Markowitz 2007-06-15 04:34:22 UTC
Steps to reproduce: On MacOS X run spamd as root with -u nobody, e.g.,

  sudo --virtual-config-dir=somepath -u nobody

The resulting log output includes the error message setgrid() not implemented

Note that using  different user name for -u, spamd is able to setrgid() without
error.

Analysis: see bug 5510 comment 9 through bug 5510 comment 13 for more details.
Like bug 5510, this problen would make it impossible to install SpamAssassin
from CPAN on MacOS X in the common case that CPAN is run as root.
Comment 1 Sidney Markowitz 2007-06-15 04:35:56 UTC
I had a typo in the description in the line that should have said

 sudo spamd --virtual-config-dir=somepath -u nobody
Comment 2 Justin Mason 2007-06-15 05:19:20 UTC
for reference, comments 9 to 13 from bug 5510:


------- Additional Comment #9 From Sidney Markowitz 2007-06-14 18:37 [reply] -------

This doesn't work for me in MacOS 10.4.9, but the bug may be something in spamd
that I didn't notice before. This error happens when you try as root to run

  spamd --virtual-config-dir=somepath -u nobody

which is what t/spamd_allow_user_rules.t tries to do

[16061] info: spamd: server started on port 48825/tcp (running version
3.2.2-r547235)
[16061] info: spamd: server pid: 16061
[16061] info: spamd: server successfully spawned child process, pid 16065
[16061] dbg: prefork: child 16065: entering state 0
[16061] dbg: prefork: new lowest idle kid: none
[16061] info: spamd: server successfully spawned child process, pid 16066
[16061] dbg: prefork: child 16066: entering state 0
[16061] dbg: prefork: new lowest idle kid: none
[16065] error: setrgid() not implemented at ../spamd/spamd.raw line 1041.
setrgid() not implemented at ../spamd/spamd.raw line 1041.
[16066] error: setrgid() not implemented at ../spamd/spamd.raw line 1041.
setrgid() not implemented at ../spamd/spamd.raw line 1041.
[16061] dbg: prefork: child closed connection


------- Additional Comment #10 From Sidney Markowitz 2007-06-14 19:38 [reply]
-------

Doing some digging I fond this mentioned as a long-time problem with trying to
run spamd as nobody on MacOS X. Both the uid and gid of 'nobody' on the Mac are
-2 and that seems to confuse things as the spamd sees $ugid as having a value of
4294967294 and then $( = $ugid produces the misleading error message that
setrgid is not implemented.

  spamd --virtual-config-dir=somepath -u www

for example executes the $( = $ugid; with no error.


------- Additional Comment #11 From Sidney Markowitz 2007-06-14 19:49 [reply]
-------

getpwnam('nobody') returns the 4294967294 values instead of -2. If at that point
I  force the values of $uuid and $ugid to be -2 then everything seems to work
correctly.

I see some code that appears to recognize the possibility of something like this

      if ( $> != $uuid and $> != ( $uuid - 2**32 ) ) {
        die "spamd: setuid to uid $uuid failed\n";
      }

but I don't really know what is going on with the various tests and workarounds
for BSD-specific behavior so I don't know just how to fix this.


------- Additional Comment #12 From Sidney Markowitz 2007-06-14 21:13 [reply]
-------

More digging: The underlying variables in the C library are unsigned ints. So
getpwnam returns what is there which looks like 4294967294 even though Apple
puts a -2 in /etc/passwd. Assignments to $< and $> are fine with either
4294967294 or -2, doing the same thing in either case. It is only assignments to
$( and $) that flake out if the number is bigger than the maximum positive 32
bit signed int.

However, POSIX::setgid() works fine to set both the real and effective gid with
one call and it accepts 4294967294.

So I suggest that we replace

   $) = "$ugid $ugid";    # effective gid
   $( = $ugid;            # real gid

with
    setgid($ugid);

I'm not expert in this stuff... Does anyone see any problem with using
POSIX::setgid here on other platforms?


------- Additional Comment #13 From Justin Mason 2007-06-15 03:04 [reply] -------

(In reply to comment #12)
> More digging: The underlying variables in the C library are unsigned ints. So
> getpwnam returns what is there which looks like 4294967294 even though Apple
> puts a -2 in /etc/passwd. Assignments to $< and $> are fine with either
> 4294967294 or -2, doing the same thing in either case. It is only assignments to
> $( and $) that flake out if the number is bigger than the maximum positive 32
> bit signed int.

this sounds like a perl bug on MacOS X (another one!! what are they doing over
there?)

> However, POSIX::setgid() works fine to set both the real and effective gid with
> one call and it accepts 4294967294.
> 
> So I suggest that we replace
> 
>    $) = "$ugid $ugid";    # effective gid
>    $( = $ugid;            # real gid
> 
> with
>     setgid($ugid);
> 
> I'm not expert in this stuff... Does anyone see any problem with using
> POSIX::setgid here on other platforms?

it may not be implemented, for one, which would cause a die().  We'd have
to protect it with an eval ' ... ' block.

you'd probably have to do sth like this:

  - run the $) / $( assignments in an eval '...' block
  - capture errors
  - if error =~ /setrgid\(\) not implemented/, then
    - run POSIX::setgid(...)
    - die if that failed too

however would it be simpler to just do (pseudocode)

  if (platform eq MacOS && ($uid = getpwnam($user)) == 4294967294) { $uid = -2; }

if that works, that would be better than attempting to use another syscall
entirely, IMO.

Also I suggest that should be opened as a totally separate bug; it's not caused
by this one, it's just being _exposed_ by the fix.

Comment 3 Justin Mason 2007-06-15 05:22:39 UTC
btw, I've added "t/root_spamd_virtual.t" as a test case for this.  Sidney, could
you verify that fails for you?
Comment 4 Sidney Markowitz 2007-06-15 07:09:02 UTC
Yes, it fails on the machine that I've been testing this on, which is an Intel
MacBook running Apple's perl 5.8.6.

I've started a test on a PPC iMac that has ActiveState's perl 5.8.8 and will
know in a few minutes if that does anything different.
Comment 5 Sidney Markowitz 2007-06-15 08:17:02 UTC
Copied from bug 5510 comment 18

------- Additional Comment From Mark Martinec  2007-06-15 08:09  [reply] -------

(In reply to comment #13)
Assignments to $< and $> are fine with either 4294967294 or -2
[...] assignments to $( and $) that flake out if the number
is bigger than the maximum positive 32 bit signed int.
> 
> this sounds like a perl bug on MacOS X

> However, POSIX::setgid() works fine to set both the real
> and effective gid with

> you'd probably have to do sth like this:
>   - run the $) / $( assignments in an eval '...' block
>   - capture errors
>   - if error =~ /setrgid\(\) not implemented/, then
>     - run POSIX::setgid(...)
>     - die if that failed too

If POSIX::setgid and POSIX::setuid are available (and I can't think
why they wouldn't be), by all means, use ONLY these, to avoid
all the $<, $>, $( and $) perl crap, which is causing all kinds
of inconsistencies on various platforms.  The Net::Server (which
is used by amavisd-new) had a series of problem-reports until
it finally switched to POSIX::setgid and POSIX::setuid.

For illustration (not that I claim it is best), this is what
currently stands in Net::Server::Daemonize.pm:

sub set_uid {
  my $uid = get_uid( shift() );
  POSIX::setuid($uid);
  if ($< != $uid || $> != $uid) { # check $> also (rt #21262)
    $< = $> = $uid; # try again-needed by some 5.8.0 linux systems (rt #13450)
    if ($< != $uid) {
      die "Couldn't become uid \"$uid\": $!\n";
    }
  }
  return 1;
}

sub set_gid {
  my $gids = get_gid( @_ );
  my $gid  = (split /\s+/, $gids)[0];
  eval { $) = $gids }; # store all the gids - this is really sort of optional
  POSIX::setgid($gid);
  if (! grep {$gid == $_} split /\s+/, $() { # look for any valid id in list
    die "Couldn't become gid \"$gid\": $!\n";
  }
  return 1;
}

Comment 6 Sidney Markowitz 2007-06-15 08:25:02 UTC
Created attachment 3993 [details]
Simple patch implementing Justin's suggested pseudocode

Martin, I agree in principle, but I would like to hear if anyone knows of any
issues with POSIX::setgid on any other platforms such as Windows.

In the meantime, here is a very minimal patch along the lines that Justin
suggested. It has the advantage of being very conservative and safe, in that it
doesn't change what we have been doing on any platform other than MacOS.

I propose we check this in now and then can take some time to investigate if
POSIX::setgid does work ok on all the platforms that spamd runs on.

I'll wait to hear feedback before I check this into trunk and put this in
review status for 3.2 branch.
Comment 7 Justin Mason 2007-06-15 08:27:26 UTC
I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid, maybe
that's the way to go, since they'll have blazed the trail and hit all the bugs
for us ;)

Another thing: it should be possible to ensure we have a good set of test cases,
now that we have the "t/root*" run-as-root test suite.
Comment 8 Sidney Markowitz 2007-06-15 08:33:52 UTC
> I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid,
> maybe that's the way to go

I wouldn't argue with that. If we switch to POSIX::setuid too, that means we can
get rid of the bug 3900 code in that same part of spamd.raw and in
Mail::SA::Util.pm?
Comment 9 Mark Martinec 2007-06-15 08:41:20 UTC
> > I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid,
> > maybe that's the way to go
> 
> If we switch to POSIX::setuid too, that means we can get rid of
> the bug 3900 code in that same part of spamd.raw and in
> Mail::SA::Util.pm?

I have no idea how that behaves on Windows (Net::Server doesn't run
there I think).

I'm quite familiar with the issues we had with setting UID:
since Perl can't assign to both of $< and $> in one atomic operation,
one or the other order of assignment failed on some platform, the only
universal solution was POSIX::setuid, which does both in one go.

About setting groups I don't think these were as troublesome as
setting uid. I think POSIX::setgid is used now mostly because
POSIX::setuid needs to be used anyhow.
Comment 11 Justin Mason 2007-06-15 08:57:55 UTC
(In reply to comment #8)
> > I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid,
> > maybe that's the way to go
> 
> I wouldn't argue with that. If we switch to POSIX::setuid too, that means we can
> get rid of the bug 3900 code in that same part of spamd.raw and in
> Mail::SA::Util.pm?

well, we can certainly do some consolidation of that code; I'd be hesitant to
remove too much of the sanity-checks though.
Comment 12 Mark Martinec 2007-06-15 09:00:10 UTC
...also:
http://rt.cpan.org/Public/Bug/Display.html?id=5450
Comment 13 Mark Martinec 2007-06-15 09:05:03 UTC
...and this is what amavisd-new does, if it is allowed to change uid
and gid by itself (otherwise Net::Server does it):

  $( = $gid;  # real GID
  $) = "$gid $gid";  # effective GID
  POSIX::setuid($uid) or die "Can't setuid to $uid: $!";
  $> = $uid; $< = $uid;  # just in case
# print STDERR "desired user=$desired_user ($uid), current: EUID: $> ($<)\n";
# print STDERR "desired group=$desired_group, current: EGID: $) ($()\n";
  $> != 0 or die "Still running as root, aborting\n";
  $< != 0 or die "Effective UID changed, but Real UID is 0\n";

and later on, to make sure Net::Server did the right thing:

  my($euid) = $>;   # effective UID
  $> = 0;           # try to become root
  POSIX::setuid(0)  if $> != 0;  # and try some more
  if ($> == 0 || $euid == 0) {   # succeded? panic!
    die ...
  }
Comment 14 Sidney Markowitz 2007-06-15 11:19:12 UTC
> I'd be hesitant to
> remove too much of the sanity-checks though

Sanity checks are fine. I was thinking more about the strange code that's
labeled as workaround for BSD peculiarities that can all be replaced with a call
to POSIX::setuid. At least that's what I recall from when I looked at it --
right now I'm about to get on an airplane and can't take time for another close
look.

> this is what amavisd-new does, if it is allowed to
> change uid and gid by itself
>
>  $( = $gid;  # real GID
>  $) = "$gid $gid";  # effective GID

That's exactly what will break if done under MacOS as root to try to change the
group to 'nobody' is $gid was set from getpwnam(), and what I want to replace
with POSIX::setgid() if we don't kludge in something just in the case of MacOS
to change the unsigned ints into signed ints.
Comment 15 Mark Martinec 2007-06-15 11:30:57 UTC
> Sanity checks are fine. I was thinking more about the strange code that's
> labeled as workaround for BSD peculiarities that can all be replaced with
> a call to POSIX::setuid.

Yes, that is my main point here.

> >  $( = $gid;  # real GID
> >  $) = "$gid $gid";  # effective GID 
> That's exactly what will break if done under MacOS as root to try to
> change the group to 'nobody' is $gid was set from getpwnam(), and what
> I want to replace with POSIX::setgid() if we don't kludge in something

I know, I'll have to apply the same fix for changing group as you do.
Fortunately for me most users leave that task to Net::Server, which seems
to get it right.
Comment 16 Mark Martinec 2007-06-15 11:42:46 UTC
> I know, I'll have to apply the same fix for changing group as you do.
> Fortunately for me most users leave that task to Net::Server, which
> seems to get it right.

... and amavisd never runs as user or group nobody
(it is supposed to have its dedicated username),
so don't rely on my claims about setgid.
My fighting with mills was all about setuid woes...
Comment 17 Mark Martinec 2007-06-15 11:49:44 UTC
> My fighting with mills was all about setuid woes...

The proper term would be 'a crusade' about setuid woes...
I'm too chatty today, sorry, I'll step down now :)
Comment 18 Sidney Markowitz 2007-06-15 14:34:41 UTC
Created attachment 3994 [details]
Patch that uses POSIX::setgid and POSIX::setuid for portability

Here's a patch that I composed on the airplane to Christchurch just now, so it
is not as thoroughly tested as it might otherwise be. I'm off to a conference
here and may not be all that responsive until I've made it through my
presentation four days from now.
Comment 19 Sidney Markowitz 2007-06-15 17:01:18 UTC
I did get to test it on my MacBook and it all looks fine. I'm not able to test
this on other OSs since my virtual machines images are not with me. If I don't
see any objections after some arbitrary amount of time I'll check this in to
trunk and put it up for review for 3.2 branch. I would like to see someone
verify it on a BSD system that demonstrated the problem of bug 3586 before we
fixed it to make sure that this patch doesn't regress that bug.

Comment 20 Mark Martinec 2007-06-15 18:17:37 UTC
> I would like to see someone verify it on a BSD system that
> demonstrated the problem of bug 3586 before we fixed it
> to make sure that this patch doesn't regress that bug.

Looks good on:
  FreeBSD xxx FreeBSD 6.2-RELEASE-p5 #0: ... amd64

[82058] dbg: config: copying current conf to backup
[82058] info: spamd: server started on port 783/tcp (running version 3.2.1)
[82058] info: spamd: server pid: 82058
[82058] info: spamd: server successfully spawned child process, pid 82060
[82058] dbg: prefork: child 82060: entering state 0
[82058] dbg: prefork: new lowest idle kid: none
[82060] dbg: prefork: sysread(8) not ready, wait max 300 secs
[82058] info: spamd: server successfully spawned child process, pid 82061
[82058] dbg: prefork: child 82061: entering state 0
[82058] dbg: prefork: new lowest idle kid: none

root     82058 58022  3.4  1.2 74720 48820 S+     0:00.93 /usr/local/bin/perl
  -T -w /usr/local/bin/spamd --virtual-config-dir=/home/mark/00 -u nobody -D
nobody   82060 82058  0.0  1.2 74728 48852 S+     0:00.01 spamd child (perl)
nobody   82061 82058  0.0  1.2 74728 48904 S+     0:00.01 spamd child (perl)

but it was fine before as well (nobody is 65534).


If I revert the change in bug 3586 (on 3.2.1), it is again fine,
but a workaround-in-use is reported:

[87675] info: spamd: server successfully spawned child process, pid 87677
[87675] dbg: prefork: child 87677: entering state 0
[87675] dbg: prefork: new lowest idle kid: none
[87677] dbg: spamd: initial attempt to change real uid failed, trying BSD 
workaround
[87675] info: spamd: server successfully spawned child process, pid 87678
[87677] dbg: prefork: sysread(8) not ready, wait max 300 secs

root     87675 87445  5.5  1.2 74716 48840 S+     0:00.93 /usr/local/bin/perl
  -T -w /usr/local/bin/spamd --virtual-config-dir=/home/mark/00 -u nobody -D
nobody   87677 87675  0.0  1.2 74720 48916 S+     0:00.01 spamd child (perl)
nobody   87678 87675  0.0  1.2 74720 48872 S+     0:00.01 spamd child (perl)

So I can say the new patch didn't break anything on FreeBSD.
Comment 21 Sidney Markowitz 2007-06-15 18:33:03 UTC
Committed to trunk revision 547841.

Please review and vote for committing to 3.2 branch.
Comment 22 Justin Mason 2007-06-18 10:41:56 UTC
trunk, "sudo prove -v t/root*" on 5.8.7 on linux x86: works fine

trunk, !! on 5.8.8 on solaris 10: can't tell, because I just broke sudo ;) I'll
retest once it's fixed.

I'd also like to hear that the non-spamd parts work on win32...
is anyone running spamd on win32? if so, we don't want to break that,
hopefully.

also, RUNNING_ON_MACOS seems superfluous btw.
Comment 23 Justin Mason 2007-06-18 10:46:35 UTC
perl 5.6.1 on linux x86: also ok
Comment 24 Justin Mason 2007-06-20 05:29:50 UTC
(In reply to comment #22)
> trunk, !! on 5.8.8 on solaris 10: can't tell, because I just broke sudo ;) I'll
> retest once it's fixed.

yep, it all works (I had to fix an unrelated bug).

+1
Comment 25 Daryl C. W. O'Shea 2007-07-04 12:00:22 UTC
+1
Comment 26 Sidney Markowitz 2007-07-04 12:29:53 UTC
Committed to branch 3.2 revision 553307.
Comment 27 Sidney Markowitz 2007-07-17 05:48:22 UTC
Reopening to target to 3.1.10

I vote +1 to apply this patch to the 3.1 branch
Comment 28 Justin Mason 2007-07-18 06:36:22 UTC
+1
Comment 29 Daryl C. W. O'Shea 2007-07-23 01:37:54 UTC
+1
Comment 30 Sidney Markowitz 2007-07-23 02:51:25 UTC
Committed to branch 3.1 revision 558676.
Comment 31 Justin Mason 2007-07-25 11:39:58 UTC
don't release 3.1.10; this code breaks under perl 5.6.1 after all, it seems :(
see bug 5574 for details.
Comment 32 Sidney Markowitz 2007-08-10 22:36:45 UTC
Fixed in bug 5574