SA Bugzilla – Bug 5518
spamd gets error: setrgid() not implemented when run as root with -u nobody on MacOS
Last modified: 2007-08-10 22:36:45 UTC
Steps to reproduce: On MacOS X run spamd as root with -u nobody, e.g., sudo --virtual-config-dir=somepath -u nobody The resulting log output includes the error message setgrid() not implemented Note that using different user name for -u, spamd is able to setrgid() without error. Analysis: see bug 5510 comment 9 through bug 5510 comment 13 for more details. Like bug 5510, this problen would make it impossible to install SpamAssassin from CPAN on MacOS X in the common case that CPAN is run as root.
I had a typo in the description in the line that should have said sudo spamd --virtual-config-dir=somepath -u nobody
for reference, comments 9 to 13 from bug 5510: ------- Additional Comment #9 From Sidney Markowitz 2007-06-14 18:37 [reply] ------- This doesn't work for me in MacOS 10.4.9, but the bug may be something in spamd that I didn't notice before. This error happens when you try as root to run spamd --virtual-config-dir=somepath -u nobody which is what t/spamd_allow_user_rules.t tries to do [16061] info: spamd: server started on port 48825/tcp (running version 3.2.2-r547235) [16061] info: spamd: server pid: 16061 [16061] info: spamd: server successfully spawned child process, pid 16065 [16061] dbg: prefork: child 16065: entering state 0 [16061] dbg: prefork: new lowest idle kid: none [16061] info: spamd: server successfully spawned child process, pid 16066 [16061] dbg: prefork: child 16066: entering state 0 [16061] dbg: prefork: new lowest idle kid: none [16065] error: setrgid() not implemented at ../spamd/spamd.raw line 1041. setrgid() not implemented at ../spamd/spamd.raw line 1041. [16066] error: setrgid() not implemented at ../spamd/spamd.raw line 1041. setrgid() not implemented at ../spamd/spamd.raw line 1041. [16061] dbg: prefork: child closed connection ------- Additional Comment #10 From Sidney Markowitz 2007-06-14 19:38 [reply] ------- Doing some digging I fond this mentioned as a long-time problem with trying to run spamd as nobody on MacOS X. Both the uid and gid of 'nobody' on the Mac are -2 and that seems to confuse things as the spamd sees $ugid as having a value of 4294967294 and then $( = $ugid produces the misleading error message that setrgid is not implemented. spamd --virtual-config-dir=somepath -u www for example executes the $( = $ugid; with no error. ------- Additional Comment #11 From Sidney Markowitz 2007-06-14 19:49 [reply] ------- getpwnam('nobody') returns the 4294967294 values instead of -2. If at that point I force the values of $uuid and $ugid to be -2 then everything seems to work correctly. I see some code that appears to recognize the possibility of something like this if ( $> != $uuid and $> != ( $uuid - 2**32 ) ) { die "spamd: setuid to uid $uuid failed\n"; } but I don't really know what is going on with the various tests and workarounds for BSD-specific behavior so I don't know just how to fix this. ------- Additional Comment #12 From Sidney Markowitz 2007-06-14 21:13 [reply] ------- More digging: The underlying variables in the C library are unsigned ints. So getpwnam returns what is there which looks like 4294967294 even though Apple puts a -2 in /etc/passwd. Assignments to $< and $> are fine with either 4294967294 or -2, doing the same thing in either case. It is only assignments to $( and $) that flake out if the number is bigger than the maximum positive 32 bit signed int. However, POSIX::setgid() works fine to set both the real and effective gid with one call and it accepts 4294967294. So I suggest that we replace $) = "$ugid $ugid"; # effective gid $( = $ugid; # real gid with setgid($ugid); I'm not expert in this stuff... Does anyone see any problem with using POSIX::setgid here on other platforms? ------- Additional Comment #13 From Justin Mason 2007-06-15 03:04 [reply] ------- (In reply to comment #12) > More digging: The underlying variables in the C library are unsigned ints. So > getpwnam returns what is there which looks like 4294967294 even though Apple > puts a -2 in /etc/passwd. Assignments to $< and $> are fine with either > 4294967294 or -2, doing the same thing in either case. It is only assignments to > $( and $) that flake out if the number is bigger than the maximum positive 32 > bit signed int. this sounds like a perl bug on MacOS X (another one!! what are they doing over there?) > However, POSIX::setgid() works fine to set both the real and effective gid with > one call and it accepts 4294967294. > > So I suggest that we replace > > $) = "$ugid $ugid"; # effective gid > $( = $ugid; # real gid > > with > setgid($ugid); > > I'm not expert in this stuff... Does anyone see any problem with using > POSIX::setgid here on other platforms? it may not be implemented, for one, which would cause a die(). We'd have to protect it with an eval ' ... ' block. you'd probably have to do sth like this: - run the $) / $( assignments in an eval '...' block - capture errors - if error =~ /setrgid\(\) not implemented/, then - run POSIX::setgid(...) - die if that failed too however would it be simpler to just do (pseudocode) if (platform eq MacOS && ($uid = getpwnam($user)) == 4294967294) { $uid = -2; } if that works, that would be better than attempting to use another syscall entirely, IMO. Also I suggest that should be opened as a totally separate bug; it's not caused by this one, it's just being _exposed_ by the fix.
btw, I've added "t/root_spamd_virtual.t" as a test case for this. Sidney, could you verify that fails for you?
Yes, it fails on the machine that I've been testing this on, which is an Intel MacBook running Apple's perl 5.8.6. I've started a test on a PPC iMac that has ActiveState's perl 5.8.8 and will know in a few minutes if that does anything different.
Copied from bug 5510 comment 18 ------- Additional Comment From Mark Martinec 2007-06-15 08:09 [reply] ------- (In reply to comment #13) Assignments to $< and $> are fine with either 4294967294 or -2 [...] assignments to $( and $) that flake out if the number is bigger than the maximum positive 32 bit signed int. > > this sounds like a perl bug on MacOS X > However, POSIX::setgid() works fine to set both the real > and effective gid with > you'd probably have to do sth like this: > - run the $) / $( assignments in an eval '...' block > - capture errors > - if error =~ /setrgid\(\) not implemented/, then > - run POSIX::setgid(...) > - die if that failed too If POSIX::setgid and POSIX::setuid are available (and I can't think why they wouldn't be), by all means, use ONLY these, to avoid all the $<, $>, $( and $) perl crap, which is causing all kinds of inconsistencies on various platforms. The Net::Server (which is used by amavisd-new) had a series of problem-reports until it finally switched to POSIX::setgid and POSIX::setuid. For illustration (not that I claim it is best), this is what currently stands in Net::Server::Daemonize.pm: sub set_uid { my $uid = get_uid( shift() ); POSIX::setuid($uid); if ($< != $uid || $> != $uid) { # check $> also (rt #21262) $< = $> = $uid; # try again-needed by some 5.8.0 linux systems (rt #13450) if ($< != $uid) { die "Couldn't become uid \"$uid\": $!\n"; } } return 1; } sub set_gid { my $gids = get_gid( @_ ); my $gid = (split /\s+/, $gids)[0]; eval { $) = $gids }; # store all the gids - this is really sort of optional POSIX::setgid($gid); if (! grep {$gid == $_} split /\s+/, $() { # look for any valid id in list die "Couldn't become gid \"$gid\": $!\n"; } return 1; }
Created attachment 3993 [details] Simple patch implementing Justin's suggested pseudocode Martin, I agree in principle, but I would like to hear if anyone knows of any issues with POSIX::setgid on any other platforms such as Windows. In the meantime, here is a very minimal patch along the lines that Justin suggested. It has the advantage of being very conservative and safe, in that it doesn't change what we have been doing on any platform other than MacOS. I propose we check this in now and then can take some time to investigate if POSIX::setgid does work ok on all the platforms that spamd runs on. I'll wait to hear feedback before I check this into trunk and put this in review status for 3.2 branch.
I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid, maybe that's the way to go, since they'll have blazed the trail and hit all the bugs for us ;) Another thing: it should be possible to ensure we have a good set of test cases, now that we have the "t/root*" run-as-root test suite.
> I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid, > maybe that's the way to go I wouldn't argue with that. If we switch to POSIX::setuid too, that means we can get rid of the bug 3900 code in that same part of spamd.raw and in Mail::SA::Util.pm?
> > I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid, > > maybe that's the way to go > > If we switch to POSIX::setuid too, that means we can get rid of > the bug 3900 code in that same part of spamd.raw and in > Mail::SA::Util.pm? I have no idea how that behaves on Windows (Net::Server doesn't run there I think). I'm quite familiar with the issues we had with setting UID: since Perl can't assign to both of $< and $> in one atomic operation, one or the other order of assignment failed on some platform, the only universal solution was POSIX::setuid, which does both in one go. About setting groups I don't think these were as troublesome as setting uid. I think POSIX::setgid is used now mostly because POSIX::setuid needs to be used anyhow.
For reference: http://rt.cpan.org/Public/Bug/Display.html?id=13450 http://rt.cpan.org/Public/Bug/Display.html?id=21262
(In reply to comment #8) > > I wonder... if Net::Server is using POSIX::setuid and POSIX::setgid, > > maybe that's the way to go > > I wouldn't argue with that. If we switch to POSIX::setuid too, that means we can > get rid of the bug 3900 code in that same part of spamd.raw and in > Mail::SA::Util.pm? well, we can certainly do some consolidation of that code; I'd be hesitant to remove too much of the sanity-checks though.
...also: http://rt.cpan.org/Public/Bug/Display.html?id=5450
...and this is what amavisd-new does, if it is allowed to change uid and gid by itself (otherwise Net::Server does it): $( = $gid; # real GID $) = "$gid $gid"; # effective GID POSIX::setuid($uid) or die "Can't setuid to $uid: $!"; $> = $uid; $< = $uid; # just in case # print STDERR "desired user=$desired_user ($uid), current: EUID: $> ($<)\n"; # print STDERR "desired group=$desired_group, current: EGID: $) ($()\n"; $> != 0 or die "Still running as root, aborting\n"; $< != 0 or die "Effective UID changed, but Real UID is 0\n"; and later on, to make sure Net::Server did the right thing: my($euid) = $>; # effective UID $> = 0; # try to become root POSIX::setuid(0) if $> != 0; # and try some more if ($> == 0 || $euid == 0) { # succeded? panic! die ... }
> I'd be hesitant to > remove too much of the sanity-checks though Sanity checks are fine. I was thinking more about the strange code that's labeled as workaround for BSD peculiarities that can all be replaced with a call to POSIX::setuid. At least that's what I recall from when I looked at it -- right now I'm about to get on an airplane and can't take time for another close look. > this is what amavisd-new does, if it is allowed to > change uid and gid by itself > > $( = $gid; # real GID > $) = "$gid $gid"; # effective GID That's exactly what will break if done under MacOS as root to try to change the group to 'nobody' is $gid was set from getpwnam(), and what I want to replace with POSIX::setgid() if we don't kludge in something just in the case of MacOS to change the unsigned ints into signed ints.
> Sanity checks are fine. I was thinking more about the strange code that's > labeled as workaround for BSD peculiarities that can all be replaced with > a call to POSIX::setuid. Yes, that is my main point here. > > $( = $gid; # real GID > > $) = "$gid $gid"; # effective GID > That's exactly what will break if done under MacOS as root to try to > change the group to 'nobody' is $gid was set from getpwnam(), and what > I want to replace with POSIX::setgid() if we don't kludge in something I know, I'll have to apply the same fix for changing group as you do. Fortunately for me most users leave that task to Net::Server, which seems to get it right.
> I know, I'll have to apply the same fix for changing group as you do. > Fortunately for me most users leave that task to Net::Server, which > seems to get it right. ... and amavisd never runs as user or group nobody (it is supposed to have its dedicated username), so don't rely on my claims about setgid. My fighting with mills was all about setuid woes...
> My fighting with mills was all about setuid woes... The proper term would be 'a crusade' about setuid woes... I'm too chatty today, sorry, I'll step down now :)
Created attachment 3994 [details] Patch that uses POSIX::setgid and POSIX::setuid for portability Here's a patch that I composed on the airplane to Christchurch just now, so it is not as thoroughly tested as it might otherwise be. I'm off to a conference here and may not be all that responsive until I've made it through my presentation four days from now.
I did get to test it on my MacBook and it all looks fine. I'm not able to test this on other OSs since my virtual machines images are not with me. If I don't see any objections after some arbitrary amount of time I'll check this in to trunk and put it up for review for 3.2 branch. I would like to see someone verify it on a BSD system that demonstrated the problem of bug 3586 before we fixed it to make sure that this patch doesn't regress that bug.
> I would like to see someone verify it on a BSD system that > demonstrated the problem of bug 3586 before we fixed it > to make sure that this patch doesn't regress that bug. Looks good on: FreeBSD xxx FreeBSD 6.2-RELEASE-p5 #0: ... amd64 [82058] dbg: config: copying current conf to backup [82058] info: spamd: server started on port 783/tcp (running version 3.2.1) [82058] info: spamd: server pid: 82058 [82058] info: spamd: server successfully spawned child process, pid 82060 [82058] dbg: prefork: child 82060: entering state 0 [82058] dbg: prefork: new lowest idle kid: none [82060] dbg: prefork: sysread(8) not ready, wait max 300 secs [82058] info: spamd: server successfully spawned child process, pid 82061 [82058] dbg: prefork: child 82061: entering state 0 [82058] dbg: prefork: new lowest idle kid: none root 82058 58022 3.4 1.2 74720 48820 S+ 0:00.93 /usr/local/bin/perl -T -w /usr/local/bin/spamd --virtual-config-dir=/home/mark/00 -u nobody -D nobody 82060 82058 0.0 1.2 74728 48852 S+ 0:00.01 spamd child (perl) nobody 82061 82058 0.0 1.2 74728 48904 S+ 0:00.01 spamd child (perl) but it was fine before as well (nobody is 65534). If I revert the change in bug 3586 (on 3.2.1), it is again fine, but a workaround-in-use is reported: [87675] info: spamd: server successfully spawned child process, pid 87677 [87675] dbg: prefork: child 87677: entering state 0 [87675] dbg: prefork: new lowest idle kid: none [87677] dbg: spamd: initial attempt to change real uid failed, trying BSD workaround [87675] info: spamd: server successfully spawned child process, pid 87678 [87677] dbg: prefork: sysread(8) not ready, wait max 300 secs root 87675 87445 5.5 1.2 74716 48840 S+ 0:00.93 /usr/local/bin/perl -T -w /usr/local/bin/spamd --virtual-config-dir=/home/mark/00 -u nobody -D nobody 87677 87675 0.0 1.2 74720 48916 S+ 0:00.01 spamd child (perl) nobody 87678 87675 0.0 1.2 74720 48872 S+ 0:00.01 spamd child (perl) So I can say the new patch didn't break anything on FreeBSD.
Committed to trunk revision 547841. Please review and vote for committing to 3.2 branch.
trunk, "sudo prove -v t/root*" on 5.8.7 on linux x86: works fine trunk, !! on 5.8.8 on solaris 10: can't tell, because I just broke sudo ;) I'll retest once it's fixed. I'd also like to hear that the non-spamd parts work on win32... is anyone running spamd on win32? if so, we don't want to break that, hopefully. also, RUNNING_ON_MACOS seems superfluous btw.
perl 5.6.1 on linux x86: also ok
(In reply to comment #22) > trunk, !! on 5.8.8 on solaris 10: can't tell, because I just broke sudo ;) I'll > retest once it's fixed. yep, it all works (I had to fix an unrelated bug). +1
+1
Committed to branch 3.2 revision 553307.
Reopening to target to 3.1.10 I vote +1 to apply this patch to the 3.1 branch
Committed to branch 3.1 revision 558676.
don't release 3.1.10; this code breaks under perl 5.6.1 after all, it seems :( see bug 5574 for details.
Fixed in bug 5574