SA Bugzilla – Bug 6526
Disable rfc-ignorant.org
Last modified: 2014-02-02 00:18:07 UTC
Bug #6525 proposes disabling NJABL in the default ruleset because it is the poorest performing DNSBL. Its extremely tiny hit rate is not worth the cost of a network query. This is to open discussion of the second poorest performing DNSBL, rfc-ignorant.org.

http://ruleqa.spamassassin.org/20101225-r1052760-n/DNS_FROM_RFC_DSN/detail
50_scores.cf:score DNS_FROM_RFC_DSN 0 0.001 0 0.001 # n=0 n=2
Poor hit rate, and we apparently don't give it a score because it is unsafe.

http://ruleqa.spamassassin.org/20101225-r1052760-n/DNS_FROM_RFC_BOGUSMX/detail
50_scores.cf:score DNS_FROM_RFC_BOGUSMX 0 1.464 0 1.668 # n=0 n=2
Very poor hit rate, but we do give it a score.

I believe we should disable this in the default rules because its very poor hit rate and score addition are not worth the extra network query. This rule should be an option for sysadmins to enable manually if they really want it.

This proposed change would have very little impact on spamassassin's outcome, but it would free up a network query that could be used for something more productive.

Discuss?
I would support disabling the dsn rule by default. What fp rate are you seeing on the bogus mx list? Regards, KAM
Sigh. Please read the link above.
(In reply to comment #2)
> Sigh. Please read the link above.

I mean on your system. The corpora don't contain enough ham usually for me to gauge fps.
(In reply to comment #3)
> (In reply to comment #2)
> > Sigh. Please read the link above.
>
> I mean on your system. The corpora don't contain enough ham usually for me to
> gauge fps.

+1 for removal
Quick and dirty look... during the last week of my logs.

20633 mail scanned by spamassassin
272 hits on DNS_FROM_RFC_BOGUSMX
6 FP's
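
For reference, the figures in the comment above can be turned into rates. This is only a rough sketch: it assumes the 6 FPs are a subset of the 272 hits, which the comment does not state explicitly, and the function name is made up for illustration.

```python
def rates(scanned, hits, false_positives):
    """Return (hit rate, FP share of hits), both as percentages."""
    hit_rate = 100.0 * hits / scanned
    fp_share = 100.0 * false_positives / hits
    return hit_rate, fp_share


# Figures taken from the comment above (one week of logs on one system).
hit_rate, fp_share = rates(scanned=20633, hits=272, false_positives=6)
print(f"hit rate: {hit_rate:.2f}%")
print(f"FP share of hits: {fp_share:.2f}%")
```

Under that assumption, roughly one message in 76 hit the rule, and about one hit in 45 was a false positive.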
Are you certain you want to do this? One of the larger "free mailbox" providers is actually listed:

<dns:yahoo.com.fulldom.rfc-ignorant.org> [127.0.0.4, 127.0.0.3]

Yahoo is listed for reasons valid to the list (abuse and postmaster sublists). Considering it is also one of the most forged domains in mailbox addresses, ....
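
As an illustration of the lookup shown above, here is a small sketch of how the query name is formed and how the returned addresses map to sublists. The address codes are taken from the check_rbl_sub arguments in the 20_dnsbl_tests.cf rules quoted later in this thread; the short sublist labels and the function names are my own shorthand, not anything defined by SpamAssassin or the list. No DNS query is made.

```python
# Return codes as used by the SpamAssassin rules quoted in this thread.
# The labels on the right are informal shorthand (an assumption).
SUBLISTS = {
    "127.0.0.2": "dsn",
    "127.0.0.3": "postmaster",
    "127.0.0.4": "abuse",
    "127.0.0.5": "whois",
    "127.0.0.8": "bogusmx",
}


def query_name(domain, zone="fulldom.rfc-ignorant.org"):
    """Build the DNSBL query name for a sender domain."""
    return f"{domain.rstrip('.')}.{zone}"


def decode(answers):
    """Map returned A records to sublist labels; unknown codes pass through."""
    return [SUBLISTS.get(answer, answer) for answer in answers]


print(query_name("yahoo.com"))             # yahoo.com.fulldom.rfc-ignorant.org
print(decode(["127.0.0.4", "127.0.0.3"]))  # ['abuse', 'postmaster']
```

Note that a single query against the combined zone returns all sublist codes at once, which is why the rules in 20_dnsbl_tests.cf share one lookup.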
> Yahoo is listed for reasons valid to the list (abuse and postmaster sublists). > Considering it is also one of the most forged domains in mailbox addresses, Spamassassin does not score this. I am suggesting removal from default rules because it is rather useless in determining ham vs. spam and it is taking up a network query on every mail scanned. If you want to continue to uselessly use rfc-ignorant.org, (or dangerously enable it with scores), you as the sysadmin may enable these rules manually.
http://www.sdsc.edu/~jeff/spam/cbc.html

SDSC's weekly statistics agree with our own statistics. Both NJABL and rfc-ignorant.org are catching 1% or less of spam.
To recap: We have only two enabled rules: DNS_FROM_RFC_DSN and DNS_FROM_RFC_BOGUSMX. The others are fully disabled in our ruleset.

BOGUSMX has a production score, but problems...
- very redundant: 0.7% of BOGUSMX hits score 5 points or below
  40/474K low scoring spam detected by BOGUSMX
- 205/258K ham FP on BOGUSMX
- FP's far outnumber the cases where BOGUSMX would identify low scoring spam.

DSN is informational because it has been unsafe. Recent masschecks show 1.6% spam but 0.5% ham. Spam hits are similarly entirely high scoring.

These two rules are making effectively zero impact on spam determination because they hit so rarely and the spam they hit is already high scoring. The FP rate of BOGUSMX proves that it is actually doing more harm than good.

For these reasons I propose making rfc-ignorant.org "tflags nopublish" for trunk.

Any further comments?
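
A rough worked version of the BOGUSMX comparison above. This sketch assumes "474K" and "258K" mean about 474,000 spam and 258,000 ham messages in the masscheck corpus (the comment gives only rounded totals), and that the 40 low-scoring spam hits are the only cases where the rule could change the verdict. The variable names are illustrative.

```python
# Rounded corpus sizes from the recap above, so results are approximate.
SPAM_TOTAL = 474_000
HAM_TOTAL = 258_000
LOW_SCORING_SPAM_HITS = 40   # spam that BOGUSMX might actually help catch
HAM_HITS = 205               # false positives

useful_rate = 100.0 * LOW_SCORING_SPAM_HITS / SPAM_TOTAL
fp_rate = 100.0 * HAM_HITS / HAM_TOTAL
fp_per_useful_hit = HAM_HITS / LOW_SCORING_SPAM_HITS

print(f"useful spam hits: {useful_rate:.4f}% of spam")
print(f"ham FPs:          {fp_rate:.4f}% of ham")
print(f"FPs per useful hit: {fp_per_useful_hit:.3f}")
```

On these numbers there are about five ham hits for every low-scoring spam hit, which is the basis for the "more harm than good" conclusion.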
(In reply to comment #9)
> For these reasons I propose making rfc-ignorant.org "tflags nopublish" for
> trunk.
>
> Any further comments?

I think it would be better to score it 0.0 to allow an administrator to enable it at their option, yes?

But I am +1 to disable it by default.

KAM
(In reply to comment #10)
> (In reply to comment #9)
>
> > For these reasons I propose making rfc-ignorant.org "tflags nopublish" for
> > trunk.
> >
> > Any further comments?
>
> I think it would be better to score it 0.0 to allow an administrator to enable
> it at their option, yes?
>
> But I am +1 to disable it by default.
>
> KAM

+1 to disable (not remove)
+1 to disable, but not remove.
Does "disable (not remove)" mean "tflags nopublish" so it remains in masscheck but not production?
(In reply to comment #13)
> Does "disable (not remove)" mean "tflags nopublish" so it remains in masscheck
> but not production?

My thoughts are: score it 0.0, remove it from being masschecked, and document (perhaps in a .pre file) that it's disabled, with commented scores to re-enable it.
Hmm, could you go ahead so I can learn from your example? I am not 100% sure what you mean.
I agree with scoring it as zero (as opposed to "hard" removal) at this time, as these are rules that do occasionally fire. Perhaps there should be a standard procedure to deal with rules deemed of little to no effect. Two steps:
1) Change score to zero (thus deactivating it by default) after discussion, etc.
2) Some time later (6 months to a year), remove it from the published rulesets, perhaps after a second discussion.

OK, so that's a bit bureaucratic, but it gives people a chance to comment and, for those who choose to continue its use, an opportunity to move it to their local rulesets (and out of the common, published rules).

In the specific case of the "bogus MX" rule, is it possible that the low firing may be because other code in your MTA (e.g. at the MAIL FROM stage) is checking and rejecting sending sources with bogus MX records before SA ever sees the message? Such is the case at my system, but that check only considers the sender, not other fields such as intermediate relays, any "Reply-To:" address, etc. I just want to confirm that your (Warren's) low hit rate isn't due to another check performed before SA ever sees the message.
(In reply to comment #15)
> Hmm, could you go ahead so I can learn from your example? I am not 100% sure
> what you mean.

I'm not sure it's ever been done. From looking, it appears scores like 0.001 were used which doesn't achieve the goal of saving the query.

So my belief is we need to disable these rules in 20_dnsbl_tests.cf by going into 50_scores.cf and setting the scores to 0.0 for these rules.

20_dnsbl_tests.cf:header __RFC_IGNORANT_ENVFROM eval:check_rbl_envfrom('rfci_envfrom', 'fulldom.rfc-ignorant.org.')
20_dnsbl_tests.cf:header DNS_FROM_RFC_DSN eval:check_rbl_sub('rfci_envfrom', '127.0.0.2')
20_dnsbl_tests.cf:header DNS_FROM_RFC_BOGUSMX eval:check_rbl_sub('rfci_envfrom', '127.0.0.8')
20_dnsbl_tests.cf:header __DNS_FROM_RFC_POST eval:check_rbl_sub('rfci_envfrom', '127.0.0.3')
20_dnsbl_tests.cf:header __DNS_FROM_RFC_ABUSE eval:check_rbl_sub('rfci_envfrom', '127.0.0.4')
20_dnsbl_tests.cf:header __DNS_FROM_RFC_WHOIS eval:check_rbl_sub('rfci_envfrom', '127.0.0.5')

I would then cut and paste the existing score lines from there to a new file v332.pre, comment them and write a note that these were disabled by default in the core rules due to poor performance and to enable them, uncomment the scores.

Does a score of 0.0 make it so that mass check doesn't test the rule?

Finally, 72_active.cf has a meta rule RFC_ABUSE_POST that also should be scored to 0 and added to the pre file.

Anyone know a better procedure?
Is it really necessary to keep a proven ineffective rule in .pre forever? Nothing stops a user from copying the entire rule from an archive we keep elsewhere of retired rule examples.
(In reply to comment #18)
> Is it really necessary to keep a proven ineffective rule in .pre forever?
>
> Nothing stops a user from copying the entire rule from an archive we keep
> elsewhere of retired rule examples.

I prefer to keep things in place so users are empowered to enable them by default. I expect we may have more RBLs that fall under this category, for example with query limits more so than performance issues.
Hmm, I just realized that disabling this rule in production effectively ends our ability to measure it accurately in masscheck due to "reuse". So we must decide all or nothing. If it is disabled in production, there is no point in having it in masscheck anymore as the results are assured to be incorrect. I disagree that it is helpful to move it to a .pre file but I agree that is a reasonable compromise if you folks don't want it completely removed.
(In reply to comment #20)
> Hmm, I just realized that disabling this rule in production effectively ends
> our ability to measure it accurately in masscheck due to "reuse". So we must
> decide all or nothing. If it is disabled in production, there is no point in
> having it in masscheck anymore as the results are assured to be incorrect.

That is expected on my part. Should we want to mass check it again, my best guess is that it should be added to your sandbox as new rules with a tflags nopublish.

This is assuming that score 0.0 negates masschecking.

Regards,
KAM
> This is assuming that score 0.0 negates masschecking. I think that isn't the case? I think... just go ahead and make your desired change and let's see what happens in tonight's masscheck.
(In reply to comment #17)
> (In reply to comment #15)
> > Hmm, could you go ahead so I can learn from your example? I am not 100% sure
> > what you mean.
>
> I'm not sure it's ever been done. From looking, it appears scores like 0.001
> were used which doesn't achieve the goal of saving the query.

That's for "informative" rules (like my LOTS_OF_MONEY) where you want the recipient to see the hit but that by themselves shouldn't affect the score.

> So my belief is we need to disable these rules in 20_dnsbl_tests.cf by going
> into 50_scores.cf and setting the scores to 0.0 for these rules.

Correct.

> Finally, 72_active.cf has a meta rule RFC_ABUSE_POST that also should be scored
> to 0 and added to the pre file.

That's mine, and it's a meta of two base RFC rules. If they're scored zero it won't fire. If we do completely remove the base RFC rules, I'll pull that one too.
Created attachment 4868 [details]
Set scores to 0, move old scores to rules/v332.pre

I think this patch matches consensus. Sets the scores for DNS_FROM_RFC_DSN and DNS_FROM_RFC_BOGUSMX to 0, copies old scores to v332.pre. I don't think there was consensus for completely removing the rules.

John Hardin still needs to set his RFC_ABUSE_POST to 0, as he said he would.

The diff looks a little weird for v332.pre - I did a svn cp from v330.pre then edited it.
(In reply to comment #24)
>
> John Hardin still needs to set his RFC_ABUSE_POST to 0, as he said he would.

$ svn commit -m "Support for bug6526"
Sending jhardin/20_misc_testing.cf
Transmitting file data .
Committed revision 1099578.
sad to see it happen
(In reply to comment #26)
> sad to see it happen

You are welcome to manually enable rules like this, if you choose to use rules that make almost no measurable difference.
Looks like this has the necessary votes (for both trunk and 3.3.2) and just needs to be committed and closed?
(In reply to comment #28)
> Looks like this has the necessary votes (for both trunk and 3.3.2) and just
> needs to be committed and closed?

Correct. Doing this now.
Sorry, took me a while to figure out the patch was based on copying v330.pre to v332.pre but got it all resolved.

3.3 branch

Sending MANIFEST
Sending rules/50_scores.cf
Transmitting file data ..
Committed revision 1100355.

and

Adding rules/v332.pre
Transmitting file data .
Committed revision 1100357.

trunk

Sending MANIFEST
Sending rules/50_scores.cf
Adding rules/v332.pre
Transmitting file data ...
Committed revision 1100362.
Sorry, I'm really curious why svn cp, modification, then svn diff didn't result in a patch that applied without problems. I was hoping that as weird as the patch looked, it would work for some reason I was missing.
It's no worries. I didn't know to cp v330.pre to v332.pre. Assumed your patch created that file. Also realized I needed to add v332.pre to MANIFEST.
(In reply to comment #26)
> sad to see it happen

Shortly after I submitted the patch, I felt somewhat remorseful. I *like* when people who violate RFCs are penalized. But then I read into the details of what exact RFC violations were being penalized, and lost all my remorse. They really don't seem worth it. They're explained in detail here:

http://www.rfc-ignorant.org/policy-dsn.php - for DNS_FROM_RFC_DSN
http://www.rfc-ignorant.org/policy-bogusmx.php - for DNS_FROM_RFC_BOGUSMX
(In reply to Bug 6490 comment #35)
(Michael Parker, 2011-05-11 15:44:31 UTC)
> You shouldn't add rules, even disabled, to .pre files.

I fully agree. A .pre file is for plugins and possibly for fixing some other anomaly, not for rules or scores. As there is nothing else besides a commented score there, and no changes in plugin loading needed for 3.3.2, I don't think we need the v332.pre at all.

Suggesting to remove the v332.pre.

Reopening to sort out this concern.
(In reply to comment #34)
> (In reply to Bug 6490 comment #35)
> (Michael Parker, 2011-05-11 15:44:31 UTC)
> > You shouldn't add rules, even disabled, to .pre files.
>
> I fully agree. A .pre file is for plugins and possibly for fixing
> some other anomaly, not for rules or scores. As there is nothing
> else besides a commented score there, and no changes in plugin
> loading needed for 3.3.2, I don't think we need the v332.pre at all.
>
> Suggesting to remove the v332.pre.
>
> Reopening to sort out this concern.

This is a blocker to releasing 3.3.2 IMO. In hindsight, you are correct. I didn't think about it when applying the patch.

Is there any existing cf convention to follow for this?

My thoughts are to create a v332.cf with an opening comment similar to v332.pre but reflecting rules rather than plugins. Or to add this to the bottom of local.cf.
(In reply to comment #35)
> > Suggesting to remove the v332.pre.
> >
> > Reopening to sort out this concern.
>
> This is a blocker to releasing 3.3.2 IMO. In hindsight, you are correct. I
> didn't think about it when applying the patch.
>
> Is there any existing cf convention to follow for this?
>
> My thoughts are to create a v332.cf with an opening comment similar to v332.pre
> but reflecting rules rather than plugins. Or to add this to the bottom of
> local.cf.

I don't see any legitimate reason to keep this in any form in spamassassin. I think we should just delete the .pre file as Mark suggests.
(In reply to comment #36)
> (In reply to comment #35)
> > > Suggesting to remove the v332.pre.
> > >
> > > Reopening to sort out this concern.
> >
> > This is a blocker to releasing 3.3.2 IMO. In hindsight, you are correct. I
> > didn't think about it when applying the patch.
> >
> > Is there any existing cf convention to follow for this?
> >
> > My thoughts are to create a v332.cf with an opening comment similar to v332.pre
> > but reflecting rules rather than plugins. Or to add this to the bottom of
> > local.cf.
>
> I don't see any legitimate reason to keep this in any form in spamassassin. I
> think we should just delete the .pre file as Mark suggests.

I disagree but I'm not voting -1. I'd prefer the commented rule stays there for administrators to make an informed decision.
(In reply to comment #37)
> I disagree but I'm not voting -1. I'd prefer the commented rule stays there for
> administrators to make an informed decision.

But is it really an *informed* decision? Many folks think using this DNSBL is a good idea, until a hard look at the statistics demonstrates this to be false.
Why not just put it into 50_scores.cf where I think it belongs?
(In reply to comment #39)
> Why not just put it into 50_scores.cf where I think it belongs?

Wouldn't that keep it in masschecks? That is a non-goal.
This subject has gotten far more attention today than it's worth. I'm comfortable with whatever gets it closed. Maybe put it back in rules/50_scores.cf commented out?
(In reply to comment #41)
> This subject has gotten far more attention today than it's worth. I'm
> comfortable with whatever gets it closed. Maybe put it back in
> rules/50_scores.cf commented out?

This is the worst of all options. Commenting out only the score would only disable the rule from production without removing it from masscheck. Keeping it in masscheck in any form gives false results that are of no use.
Pre files removed for v332 and v340 and added to local.cf for consideration in trunk. Removing v332.pre in 3.3 branch as that's clearly wrong.

3.3 branch:

Deleting rules/v332.pre
Committed revision 1102298.

Trunk:

Sending rules/local.cf
Deleting rules/v332.pre
Deleting rules/v340.pre
Transmitting file data .
Committed revision 1102296.

Rules commented out in local.cf in trunk. This makes the issue "solved" for 3.3.2 and we can discuss and figure out if we want commented out rules for 3.4.0.

Regards,
KAM
Close. Already committed to trunk and 3.3. Unless you really want to "...discuss and figure out if we want commented out rules for 3.4.0."
fixed
This needs to stay on radar because there is commented code in trunk that needs to be removed or changed.
(In reply to comment #46)
> This needs to stay on radar because there is commented code in trunk that needs
> to be removed or changed.

Should it be removed or changed?
(In reply to comment #47)
> (In reply to comment #46)
> > This needs to stay on radar because there is commented code in trunk that needs
> > to be removed or changed.
>
> Should it be removed or changed?

I'm +1 for nuking any/all reference to rfc-ignorant.org and turn it into a wiki post for those who must have the extra pointless BL lookup.
(In reply to comment #48)
> (In reply to comment #47)
> > (In reply to comment #46)
> > > This needs to stay on radar because there is commented code in trunk that needs
> > > to be removed or changed.
> >
> > Should it be removed or changed?
>
> I'm +1 for nuking any/all reference to rfc-ignorant.org and turn it into a wiki
> post for those who must have the extra pointless BL lookup.

That's a great idea. How about a generic wiki that can list any rules or things that admins might consider? +1
(In reply to comment #49)
> (In reply to comment #48)
> > I'm +1 for nuking any/all reference to rfc-ignorant.org and turn it into a wiki
> > post for those who must have the extra pointless BL lookup.
>
> That's a great idea. How about a generic wiki that can list any rules or
> things that admins might consider? +1

+1, as this is pretty similar to what I've been suggesting for months. =)
(In reply to comment #50)
> (In reply to comment #49)
> > (In reply to comment #48)
> > > I'm +1 for nuking any/all reference to rfc-ignorant.org and turn it into a wiki
> > > post for those who must have the extra pointless BL lookup.
> >
> > That's a great idea. How about a generic wiki that can list any rules or
> > things that admins might consider? +1
>
> +1, as this is pretty similar to what I've been suggesting for months. =)

I'm sorry but I must have missed that. I had you in the "delete it, pretend it never existed and disavow all knowledge of said rule" category.

I'll work on this after the 3.3.2 release.
(In reply to comment #51)
> I'm sorry but I must have missed that. I had you in the "delete it, pretend it
> never existed and disavow all knowledge of said rule" category.

I thought "document it elsewhere" and I thought I communicated that on list. In any case, no need to beat a dead horse. I'm glad this is the outcome.
Perhaps there should be a file in the ruleset distribution not ending with .cf or .pre (therefore not actively used) which accumulates recently removed rules, especially those which might be deemed useful for the overly paranoid?
There's already a place for rules which aren't included with SA: http://wiki.apache.org/spamassassin/CustomRulesets
Custom rulesets are for EXTRA rules. My suggestion is for REMOVED rules which were present in the SA official channel, and a way to alert people of rules' removal in a way which allows them to save a copy should they want to use it locally even after its removal. I leave the expiration period (suggested minimum of 1 month) for the discussion (and suggest it only to prevent an infinitely growing file). Simply removing a rule then telling people it's gone doesn't give them a chance to save it locally. Some people might not be smart enough to write their own rules.
(In reply to comment #54)
> There's already a place for rules which aren't included with SA:
>
> http://wiki.apache.org/spamassassin/CustomRulesets

I think that's more for rules that never were a part of SA, though. My thoughts are most along the lines of D.Stussy's comments in #53 but a new wiki page is fine too.
(In reply to comment #56)
> (In reply to comment #54)
> > There's already a place for rules which aren't included with SA:
> >
> > http://wiki.apache.org/spamassassin/CustomRulesets
>
> I think that's more for rules that never were a part of SA, though. My
> thoughts are most along the lines of D.Stussy's comments in #53 but a new wiki
> page is fine too.

Wiki page, please.
Created attachment 4926 [details]
Remove rules from trunk's local.cf

Created http://wiki.apache.org/spamassassin/RemovedRulesets with the stuff from RFC-Ignorant. Linked from http://wiki.apache.org/spamassassin/CustomRulesets .

This patch removes it from trunk's rules/local.cf (comment 43). This bug can be closed as soon as that's committed.

The wiki asking me questions I need to google every time I edit it is pissing me off:
"Who is the current President of the ASF (full name)?"
"ApacheCon Europe 2007 was hosted in which city?"
Can this be fixed?
(In reply to comment #58)
> Created attachment 4926 [details]
> Remove rules from trunk's local.cf
>
> Created http://wiki.apache.org/spamassassin/RemovedRulesets with the stuff from
> RFC-Ignorant. Linked from http://wiki.apache.org/spamassassin/CustomRulesets .
>
> This patch removes it from trunk's rules/local.cf (comment 43). This bug can
> be closed as soon as that's committed.
>
> The wiki asking me questions I need to google every time I edit it is pissing
> me off:
> "Who is the current President of the ASF (full name)?"
> "ApacheCon Europe 2007 was hosted in which city?"
> Can this be fixed?

Done. Can you repeat for bug 6490 also at the bottom of local.cf?

svn commit -m 'remove RFC-Ignorant now that it is documented on http://wiki.apache.org/spamassassin/RemovedRulesets'
Sending rules/local.cf
Transmitting file data .
Committed revision 1139012.

re: wiki turing test, I can't recreate the issue. Please discuss on dev list.
Rfc-ignorant.org went down permanently on 30 November 2012. (See the announcement at <https://web.archive.org/web/20121013071621/http://rfc-ignorant.org/endofanera.php>.) Perhaps the "Removed Rulesets" wiki page could be updated to reflect this, just in case anyone is still using the rule and isn't sure why it's no longer working.
(In reply to Tristan Miller from comment #60)
> Rfc-ignorant.org went down permanently on 30 November 2012. (See the
> announcement at
> <https://web.archive.org/web/20121013071621/http://rfc-ignorant.org/endofanera.php>.)
> Perhaps the "Removed Rulesets" wiki page could be updated
> to reflect this, just in case anyone is still using the rule and isn't sure
> why it's no longer working.

see https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6836 2012-09-15
It appears that the .de TLD reincarnation of rfc-ignorant has been abandoned. Its domain registration has expired (currently in the 30 day redemption/purge pending period) and the project itself never populated a database.

Domain: rfc-ignorant.de
Status: redemptionPeriod
Changed: 2014-01-22T09:40:04+01:00

Therefore, I see no reason to render its rules to the "removed" SA rule page.

CC: Bug 6836.