SA Bugzilla – Bug 5751
sa-update breaks SA installation if GPG validation of the base ruleset fails on first run
Last modified: 2015-04-13 22:37:08 UTC
https://bugzilla.redhat.com/show_bug.cgi?id=392851 Several users are reporting that sa-update, with GPG signature check failing, left an incomplete rules directory. When spamassassin tries to use that directory later it fails and passes mail with no checks. Two bugs here? 1) sa-update should not leave a broken rules directory if it fails. I have no idea how people are getting broken directories though, because I have never seen this happen with Fedora's default packaging and I have never used sa-update --import myself. 2) spamassassin shouldn't attempt to use rules dir that is missing certain key files.
(In reply to comment #0) > 1) sa-update should not leave a broken rules directory if it fails. I have no > idea how people are getting broken directories though, because I have never seen > this happen with Fedora's default packaging and I have never used sa-update > --import myself. yeah, it's odd alright. could it be from deleting the sa-update keyring dir in /etc/mail/spamassassin/sa-update-keys?
The RPM or scripts are not touching or doing anything special with /etc/mail/spamassassin/sa-update-keys.
Targeting for 3.2.4
(In reply to comment #2) > The RPM or scripts are not touching or doing anything special with > /etc/mail/spamassassin/sa-update-keys. yeah -- I'm thinking a user might have accidentally deleted it...
I have added some more detailed analysis to the original bug report at https://bugzilla.redhat.com/show_bug.cgi?id=392851 A specific REPRODUCIBLE case of such error is given there. A more general reproducible error case is as follows: On a fresh install, run sa-update with two "--channel" inputs with one corresponding to the base ruleset (e.g., updates.spamassassin.org) but without a valid gpg key and the other (e.g., saupdates.onprotect.com) corresponding to some other supplementary ruleset but with a valid gpg key. The order doesn't seem to matter. Then, sa-update will happily create the /var/lib/spamassassin/3.002003 directory (using Fedora locations) and populate it with the rules from the channel with the valid gpg key while warning of a bad gpg key and failing to add the base ruleset for updates.spamassassin.org. The non-expert user likely won't see/understand/recognize the warning and will be left with a spamassassin configuration without any valid base rules since once the /var/lib/spamassassin/3.002003 directory is created, it overrides the other locations for the original non-updated rules (e.g., /usr/share/spamassassin) Not sure though what the best way to fix this is since technically sa-update is not doing anything wrong. It is only just creating a non-working situation that is obscure to most non-experts :)
(In reply to comment #5) > On a fresh install, run sa-update with two "--channel" inputs with one > corresponding to the base ruleset (e.g., updates.spamassassin.org) but without a > valid gpg key and the other (e.g., saupdates.onprotect.com) corresponding to > some other supplementary ruleset but with a valid gpg key. The order doesn't > seem to matter. > > Then, sa-update will happily create the /var/lib/spamassassin/3.002003 directory > (using Fedora locations) and populate it with the rules from the channel with > the valid gpg key while warning of a bad gpg key and failing to add the base > ruleset for updates.spamassassin.org. The non-expert user likely won't > see/understand/recognize the warning and will be left with a spamassassin > configuration without any valid base rules since once the > /var/lib/spamassassin/3.002003 directory is created, it overrides the other > locations for the original non-updated rules (e.g., /usr/share/spamassassin) > > Not sure though what the best way to fix this is since technically sa-update is > not doing anything wrong. It is only just creating a non-working situation that > is obscure to most non-experts :) ah. I see! it could probably also be reproduced by doing sa-update with just an additional ruleset channel, without the base ruleset's channel. that way, the /var/lib/spamassassin/3.002003 dir is created without any valid base ruleset.
(In reply to comment #5) > Then, sa-update will happily create the /var/lib/spamassassin/3.002003 directory > (using Fedora locations) and populate it with the rules from the channel with > the valid gpg key while warning of a bad gpg key and failing to add the base > ruleset for updates.spamassassin.org. The non-expert user likely won't > see/understand/recognize the warning and will be left with a spamassassin > configuration without any valid base rules since once the > /var/lib/spamassassin/3.002003 directory is created, it overrides the other > locations for the original non-updated rules (e.g., /usr/share/spamassassin) Yes, of course. That's how sa-update is supposed to work. updates.spamassassin.org isn't a required channel, nor should it be (there are use cases where people want to just use their own channels, for instance). I also don't think that the whole update should fail if any of the channels fail. http://wiki.apache.org/spamassassin/RuleUpdates has this covered as the very top item in the FAQ section, btw. For most people, if they're adding in a third party channel, they'll need to pay attention to the fact that they need the project's channel to install as well to get those rules. IMO, this would be addressed by us not including any rules w/ the standard distro and requiring all rules to come from the update system. It also fixes our issue whereby we have multiple areas for rules for the same SA version (ie: rules/branches/3.2 and branches/3.2/rules ...)
(In reply to comment #7) > Yes, of course. That's how sa-update is supposed to work. updates.spamassassin.org isn't a required > channel, nor should it be (there are use cases where people want to just use their own channels, for > instance). after seeing the uses of rule updates in the field, I'd say it'd be more appropriate if updates.spamassassin *was* a required channel, triggering additional paranoia about leaving empty rules dirs behind. We could possibly add a command-line switch to disable this error-checking, but IMO it's better than allowing PEBKAC errors so easily... > http://wiki.apache.org/spamassassin/RuleUpdates has this covered as the very top item in the FAQ > section, btw. is that linked from the *real* FAQ anywhere? > For most people, if they're adding in a third party channel, they'll need to pay attention to > the fact that they need the project's channel to install as well to get those rules. > IMO, this would be addressed by us not including any rules w/ the standard distro and requiring all > rules to come from the update system. It also fixes our issue whereby we have multiple areas for rules > for the same SA version (ie: rules/branches/3.2 and branches/3.2/rules ...) hmm.... might be worth considering this for 3.3.0. how's about opening a bug to discuss it?
(In reply to comment #8) > after seeing the uses of rule updates in the field, I'd say it'd be more > appropriate if updates.spamassassin *was* a required channel, triggering > additional paranoia about leaving empty rules dirs behind. We could possibly > add a command-line switch to disable this error-checking, but IMO it's better > than allowing PEBKAC errors so easily... I'm going to be pretty -1 about that. The whole goal we were going for was to separate engine from rules. Requiring that people get our rules is a full 180 from the original goal. I also think that the whole default rules dir/local state dir override thing is a bit of a kluge and leads to problems like this. Which gets me back to the "only make rules available via sa-update" idea, which should make everyone happy. :) > > http://wiki.apache.org/spamassassin/RuleUpdates has this covered as the very > top item in the FAQ > > section, btw. > > is that linked from the *real* FAQ anywhere? It is the real faq, for sa-update anyway. It's in the man page. :) I don't know if it's linked from the general SA FAQ. > > IMO, this would be addressed by us not including any rules w/ the standard > distro and requiring all > > rules to come from the update system. It also fixes our issue whereby we have > multiple areas for rules > > for the same SA version (ie: rules/branches/3.2 and branches/3.2/rules ...) > > hmm.... might be worth considering this for 3.3.0. how's about opening a bug to > discuss it? Sure. bug 5752. :)
Well, I think part of the problem in my case (and in the likely use case) is when someone has default rules installed in __def_rules_dir__ (in my case /usr/share/spamassassin) but then gets only a partial update installed in __local_state_dir__/spamassassin/__version__ (/var/lib/spamassassin/3.002003 in my case). In fact, even if just the directory __version__ (3.002003) is created then, suddenly all the rules in __def_rules_dir__ are overriden. This would occur even if no rules are added to the directory (or in my case the rules that were in 3.002003 all had errors because they were missing dependencies), resulting in effectively NO RULE CHECKING. Wouldn't a more forgiving approach be to first look in __version__ and then continue to look in __def_rules_dir__ if the subdirectory updates_spamassassin_org is not present. Then, additionally, in order to prevent incomplete updates, the repository subdirectories would not be added (or later changed) in __version__ if the repository update is not successful. This would ensure that __def_rules_dir would never be overriden unless a successful update occurred to updates_spamassassin_org (or more generally the user-defined default repository). To preserve separation of rules and engines, one could make a user-defined variable assign which repository is the primary rule set for updates. The default value of that variable would likely be updates_spamassassin_org So, in my case, the failure of updates_spamassassin_org to update would leave that directory missing from __version__ (as it was) so that spamassassin would continue to look also at __def_rules_dir__ for the base rule set. If you truly want to ignore all rules in __def_rules_dir__ then you could create an empty directory updates_spamassassin_org (or whatever your default repository is) so that spamassassin would think that the ruleset was updated successfully. There are many possible variations on the above ideas but I would think that it would be more robust than the current approach.
(In reply to comment #10) > Wouldn't a more forgiving approach be to first look in __version__ and then > continue to look in __def_rules_dir__ if the subdirectory > updates_spamassassin_org is not present. Except that not having the SA update channel is a valid situation. A simple fix here IMO is to skip the local_state_dir if there are no usable files in it. Then it would DTRT. Then for 3.3, there can be the whole discussion of requiring an update channel, etc.
(In reply to comment #11) > (In reply to comment #10) > > Wouldn't a more forgiving approach be to first look in __version__ and then > > continue to look in __def_rules_dir__ if the subdirectory > > updates_spamassassin_org is not present. > > Except that not having the SA update channel is a valid situation. A simple fix > here IMO is to skip the local_state_dir if there are no usable files in it. > Then it would DTRT. I don't think that fixes the issue in http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5751#c5 .
(In reply to comment #12) > I don't think that fixes the issue in > http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5751#c5 . Well, yes and no, and that's not what we were discussing. ;) The issues in question are: - if the local_state_dir is created, but nothing is in it, what do we do? Right now, if it exists, we use it. The current sa-update only creates the dir if it's about to extract channel files, and if that fails the directories get removed -- see bug 4941 and sa-update starting around line 726. So in short, sa-update should never cause this situation. If we are worried about this situation though, the way to work around it, IMO, is to skip it. - if the local_state_dir doesn't have updates.spamassassin.org in it, but has other channels in it, what do we do? Our design says: using updates.spamassassin.org is optional, so use whatever is available in the dir. I'm going to be absolutely -1 on changing this -- we have to allow people to not use our rules if they don't want to. We can highly recommend they use them, but it's up to the user, and larger organizations will want to be able to have their own channels available and not use ours. We have documented about adding channels and needing to use updates.spamassassin.org if they expect the standard ruleset. If we want to improve these docs to make it more obvious, great. - if one channel fails while we're updating multiple channels, what do we do? Right now, we let that channel fail and continue with the other channels. IMO, this is the right thing to do. It does mean that people need to pay attention if this is the first time they're running sa-update for a given channel, and handle failures appropriately. At the moment, I wouldn't be opposed, if people wanted to, have sa-update be more verbose if it's trying to get/install updates.spamassassin.org, and it fails, and other channels already exist or other channels are in the queue to be updated. It could probably be added w/out making too much of a mess (ie: track the pass/fail per channel, then at the end figure out the situation and warn appropriately -- hey, isn't this part of bug 5043?). :) As you may have guessed at this point, I'm not very sympathetic to people burning themselves when playing with fire. I think fire is important, and want to make sure that if people want to, they can play with it. I don't mind making sure the warning signs are clear about the dangers of doing so, however.
(In reply to comment #13) > As you may have guessed at this point, I'm not very sympathetic to people > burning themselves when playing with fire. I think fire is important, and want > to make sure that if people want to, they can play with it. I don't mind making > sure the warning signs are clear about the dangers of doing so, however. Yes but I think there is another equally valid point here which is that a failed (partial) update shouldn't leave you in a worse state. The first priority with failed updates should be "DO NO HARM" The problem in my mind is that even the most trivial addition to __version__ as part of a failed multi-part run of sa-update causes ALL the default rules in __def_rules_dir__ to be overriden leaving spamassassin in a broken state. IMHO, this is a rather unnatural result in that one would naively expect that updates would fail gracefully in that old rules would only be overriden to the extent that they are replaced (or intentionally not replaced). I think a better answer would be to have the default case of a partial update failure be that NONE of the updates occur unless you use a (new) command line flag --force. A failed partial update would then give a clear warning to the effect that "No update occurred since update xxx failed. If you want to allow only a partial update to proceed then use the --force flag" I think this "default do-no-harm" policy is much better than just adding more warnings. In my specific case, the sa-update was run from a root-only readable cron file that I didn't even realize was there (we can't all know everything about our systems :) and its ouput was buried deep in some log file so I never even knew a failed update occurred. The result is that without any warning I was left with an inoperable spamassassin setup. With my suggestion, the default would have been that no update occurred and at least spamassassin would still work. Also, in my particular case, even the update portion that did occur didn't help because those rules had dependencies on rules in updates.spamassassin.org so allowing a partial update truly had no benefit at all. IMHO, the desired behavior should be analogous to programs like 'rpm' where if you specify multiple rpms to install/upgrade then the installation/upgrade only occurs if ALL of them work unless you use the --nodeps or --force type flags to override the default behavior if you really know what you are doing. I'm not sure why anyone would object to this compromise solution since experts like yourself could still proceed at their own risk and allow partial updates to occur while less sophisticated users would at worse have a still working but not updated spamassassin installation.
> we have to allow people to not use our rules if they don't want to. We can > highly recommend they use them, but it's up to the user, and larger > organizations will want to be able to have their own channels available and > not use ours. Let me add my 2 cents worth here. I can understand the above requirement, but I would expect that not using your rules would be an unusual situation. Perhaps an extremely unusual situation. I think that not using your rules should require an explicit ok from the site manager. I propose that there should be a configuration option that must be set before not using your rules would be allowed. Without that option, the distributed rules would be used if none are found in the update directory. That way someone who wanted to use updated additional rules but use your distributed rules without updates could do so.
(In reply to comment #15) > > we have to allow people to not use our rules if they don't want to. We can > > highly recommend they use them, but it's up to the user, and larger > > organizations will want to be able to have their own channels available and > > not use ours. > > Let me add my 2 cents worth here. I can understand the above requirement, but > I would expect that not using your rules would be an unusual situation. > Perhaps an extremely unusual situation. I think that not using your rules > should require an explicit ok from the site manager. I propose that there > should be a configuration option that must be set before not using your rules > would be allowed. Without that option, the distributed rules would be used > if none are found in the update directory. That way someone who wanted to > use updated additional rules but use your distributed rules without updates > could do so. I really have to agree. I *do* agree that we need to support using SA (and sa-update) without requiring that the default ruleset be used -- but in this case, using sa-update to download thirdparty rulesets *without* also downloading updates to the default ruleset is a very unusual case; I doubt there's anyone in the world yet doing this. IMO, it'd make most sense if that was not permitted unless explicitly requested using a new "--no-default-rules" switch or similar.
(In reply to comment #16) > I *do* agree that we need to support using SA (and sa-update) without requiring > that the default ruleset be used -- but in this case, using sa-update to > download thirdparty rulesets *without* also downloading updates to the default > ruleset is a very unusual case; I doubt there's anyone in the world yet doing this. A very obvious use case is when people want to download channels w/ different frequency, and so run sa-update once for updates.spamassassin.org, and then again at a different time for some other channel. > IMO, it'd make most sense if that was not permitted unless explicitly requested > using a new "--no-default-rules" switch or similar. I'm against this idea for previously discussed reasons.
(In reply to comment #14) > Yes but I think there is another equally valid point here which is that a failed > (partial) update shouldn't leave you in a worse state. The first priority with > failed updates should be "DO NO HARM" And that's what sa-update already does. The issue here is that when running sa-update for a channel *the first time* when using multiple channels, people have to pay attention to whether there was a failure and if so, fix the issue and try again. I believe all other situations are handled automatically, because they can be. > I think a better answer would be to have the default case of a partial update > failure be that NONE of the updates occur unless you use a (new) command line > flag --force. If someone runs sa-update w/ different channels (for example so that updates occur at different frequencies for different channels), then you have the same problem and no commandline option or sa-update logic will be able to solve the problem. > I think this "default do-no-harm" policy is much better than just adding more > warnings. In my specific case, the sa-update was run from a root-only readable > cron file that I didn't even realize was there (we can't all know everything > about our systems :) and its ouput was buried deep in some log file so I never > even knew a failed update occurred. That's a system administration problem. > I'm not sure why anyone would object to this compromise solution since experts > like yourself could still proceed at their own risk and allow partial updates to > occur while less sophisticated users would at worse have a still working but not > updated spamassassin installation. IMO, bug 5193 (making sa-update more "atomic") would allow more for "fail a whole run if any part fails", if people want to go that way. I'm not opposed to that kind of change if people feel it's the better way to go. It does solve this specific situation, though not all possible similar situations, which I believe as previously described is dependent on the user.
Suggestion to maybe get around the impasse here: When SA is first installed (initially, or as an upgrade) a check is made to see if the updates directory exists. If not, it is created and the rules supplied with the package are copied into this directory. This should be done as part of the installation process, NOT as part of any normal SA activity. SA will no longer look in the original rules location for rules, only in the updates hierarchy. If the user does not use an SA update channel the original rules will remain functional, but SA will only have to look in one place for rules, not two places with an override on the first. If the user is using an SA update channel, obviously updated rules will override the package rules in the normal manner; it will simply be the case of a newer update replacing an older update. For the rare case of packagers of third party software that do not wish to use either the native SA rules nor SA rule updates, they only have to add a final installation step to remove the default SA rules from the update directory that were installed by an earlier installation step. Note that it might be simpler all around to simply re-home the package rules to the updates directory in the first place, and remove the code that copies from the current rules location to the updates directory. The one drawback I can see with doing this is that on an upgrade the package rules would/might override any current update SA rules in the updates directory. OTOH rules tend to change between releases, so this might be a good thing.
(In reply to comment #19) > SA will no longer look in the original rules location for rules, only in the > updates hierarchy. If the user does not use an SA update channel the original > rules will remain functional, but SA will only have to look in one place for > rules, not two places with an override on the first. If the user is using an > SA update channel, obviously updated rules will override the package rules in > the normal manner; it will simply be the case of a newer update replacing an > older update. yes, this is bug 5752. I agree we should do that for 3.3.0, but we still have this bug for 3.2.x.
moving on to 3.2.5, so 3.2.4 can be released
moving to 3.2.6 so that we can release a 3.2.5
Just a quick comment: In my SA logs, I sometimes find mentioning of syntax errors after running sa-update, even after restarting SA thereafter (do I need to?). The latest example is a problem with this rule: (in 80_additional.cf): @4000000048d6216f360ad4cc [15045] warn: syntax error at /var/db/spamassassin/3.002004/updates_spamassassin_org/80_additional.cf, rule __TVD_BODY, line 13, near "; @4000000048d6216f360d6cdc [15045] warn: }" @4000000048d6216f360ffd1c [15045] warn: /var/db/spamassassin/3.002004/updates_spamassassin_org/80_additional.cf, rule __TVD_BODY, has too many errors. There is only a comment on line 13. Help about how to debug such stuff would be great... This is SpamAssassin 3.2.4 on OpenBSD 4.3, the official package.
Re-targeting from 3.2.6 to 3.4.0 (current trunk).
Moving to 3.4.1 to allow for a 3.4.0 release.
Closing. sa-update being required to run has largely fixed this issue.