6251 – Temporarily Reduce DNSWL scores for 3.3.0 release

Status:	RESOLVED FIXED

Alias:	None

Product:	Spamassassin
Classification:	Unclassified
Component:	Rules
Version:	3.3.0
Hardware:	Other All

Importance:	P1 normal
Target Milestone:	3.3.0
Assignee:	SpamAssassin Developer Mailing List

Reported:	2009-12-08 21:08 UTC by Warren Togami
Modified:	2009-12-15 13:42 UTC
CC List:	6 users

Description Warren Togami 2009-12-08 21:08:36 UTC

Similar to Bug #6247 for Return Path, I believe we should reduce the scores of DNSWL before the 3.3.0 release.

http://ruleqa.spamassassin.org/20091205-r887515-n
For many weeks, the weekly masschecks have shown a minor but consistent number of false positives in DNSWL medium and low. As a matter of practice I generally do not report whitelist violations or blacklist FPs, because doing so would artificially bias the ruleqa statistics without solving underlying problems in listing and removal policy.

Current 50_scores.cf:
score RCVD_IN_DNSWL_LOW 0 -1 0 -1
score RCVD_IN_DNSWL_MED 0 -4 0 -4
score RCVD_IN_DNSWL_HI 0 -8 0 -8
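
For readers unfamiliar with the four-column form: each `score` line carries one value per SpamAssassin score set (set 0: no Bayes, no network tests; set 1: network tests only; set 2: Bayes only; set 3: both). DNSWL is a network test, which is why only columns 1 and 3 are non-zero. A minimal Python sketch of how such a line could be read follows; the helper names `parse_score_line` and `active_score` are illustrative and not part of SpamAssassin.

```python
# Labels for the four score sets, in column order.
SCORE_SETS = ("no bayes, no net", "net only", "bayes only", "bayes + net")


def parse_score_line(line):
    """Parse 'score NAME v0 [v1 v2 v3]' into (name, [v0, v1, v2, v3])."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "score":
        raise ValueError(f"not a score line: {line!r}")
    name = parts[1]
    values = [float(v) for v in parts[2:]]
    if len(values) == 1:
        # A single value is used for every score set.
        values = values * 4
    if len(values) != 4:
        raise ValueError(f"expected 1 or 4 values, got {len(values)}")
    return name, values


def active_score(values, use_bayes, use_net):
    """Pick the column for the current configuration."""
    index = (2 if use_bayes else 0) + (1 if use_net else 0)
    return values[index]


name, values = parse_score_line("score RCVD_IN_DNSWL_MED 0 -4 0 -4")
for label, value in zip(SCORE_SETS, values):
    print(f"{name} [{label}]: {value}")
```

With network tests disabled, the rule contributes nothing regardless of which scores are chosen here.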

My concerns:
* DNSWL has no obvious and easy methods on their site to report violations where spam was sent from a DNSWL listed host.
* Does DNSWL currently use spam traps as an automated method to detect whitelist violations? I don't know. Whatever their regular practices are, I strongly believe that automation is the only sustainable way for a service of this nature to be maintainable in the long term.
* When DNSWL was allowed to float during GA rescoring, the GA assigned scores of much smaller magnitude than our fixed scores.

score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

I believe we should ship 3.3.0 with these as reasonable scores. After the 3.3.0 release, if DNSWL clarifies the above concerns and the DNSWL statistics in ruleqa improve due to methodology (not one-off corpus cleaning), then I believe it would be fully appropriate to increase the scores in sa-update.

Comments?
Votes?

Comment 1 Benny Pedersen 2009-12-09 02:07:36 UTC

Won't these changed scores just hide the problem more?

With the old scores it is easier to see whether an IP really is ham or spam, and from what I have seen on the users@sa mailing list, some people don't like, or don't understand what to do about, listings that score their spam as ham.

IMHO the IP should simply be relisted at another level if it is spamming; changing the scores only makes things worse, as I see it.

The problem, as I see it, is that users just complain, take the fastest possible solution, and forget what really should be done :(

Comment 2 Justin Mason 2009-12-09 03:26:09 UTC

(In reply to comment #1)
> With the old scores it is easier to see whether an IP really is ham or spam,
> and from what I have seen on the users@sa mailing list, some people don't
> like, or don't understand what to do about, listings that score their spam
> as ham.

From our POV, we want SA to compensate for individual rule FP/FNs by allowing other, more trustworthy rules to "take over", so reducing the scores is the appropriate thing for us.  Whether or not an IP is misclassified by DNSWL can be determined by the sysadmin using grep, or by us using ruleqa, etc.  It shouldn't have to cause problems for the user.
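
As a rough illustration of the grep approach, the Python sketch below lists messages that were still flagged as spam even though a DNSWL rule fired; those are candidates for a closer look at the listed IP. The line shape (`result: Y <score> - RULE,RULE ...`) is an assumption modelled on spamd-style result lines, and both the sample log lines and the function name `dnswl_hits_on_spam` are made up for this example. Adjust the pattern to whatever your own logs contain.

```python
import re

# Assumed line shape: "... result: Y 7 - RULE_A,RULE_B scantime=..."
# "Y" marks a message flagged as spam, "." one that was not.
RESULT_RE = re.compile(
    r"result: (?P<flag>[Y.]) (?P<score>-?\d+(?:\.\d+)?) - (?P<rules>\S+)"
)


def dnswl_hits_on_spam(lines):
    """Return (score, [dnswl rules]) for spam-flagged lines with a DNSWL hit."""
    hits = []
    for line in lines:
        match = RESULT_RE.search(line)
        if match is None or match.group("flag") != "Y":
            continue
        rules = [
            rule
            for rule in match.group("rules").split(",")
            if rule.startswith("RCVD_IN_DNSWL_")
        ]
        if rules:
            hits.append((float(match.group("score")), rules))
    return hits


# Hypothetical sample lines, not taken from a real log.
sample = [
    "spamd: result: Y 7 - BAYES_99,RCVD_IN_DNSWL_LOW scantime=1.0",
    "spamd: result: . -3 - RCVD_IN_DNSWL_MED scantime=0.5",
    "spamd: result: Y 12 - BAYES_99,URIBL_BLACK scantime=0.9",
]
for score, rules in dnswl_hits_on_spam(sample):
    print(score, rules)
```

Note that this only finds spam that other rules caught anyway; spam that slipped through because of the whitelist score would have to be found from user reports or a hand-sorted corpus.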

Comment 3 Justin Mason 2009-12-09 03:27:20 UTC

also, same thing applies as in bug 6247:

by the way -- I don't know if we can safely do this in the 3.3.0 release.  We
need to ensure the changes to these rules don't move FP%/FN% rates far from
the levels they were measured at during rescoring.  (This can happen if the new
rules hit different mails, or the scores don't sufficiently compensate for
FP/FNs measured in the rescore mass-check.)

So even if the rules overlap sufficiently close to 100% that we can just drop
'em in as replacements, a final, pre-release step for this bug will be to
measure the _new_ scores using the rescore mass-check's logfiles (with
s/HABEAS_ACCREDITED_COI/RCVD_IN_RP_CERTIFIED/g etc.), determine FP/FN%, and
make sure it matches/improves on the old rates (or is at least still acceptable).
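
The re-measurement step described above can be sketched in Python as follows. This is a simplified model, not the project's actual rescoring tooling: each mass-check result is reduced to a hypothetical `(is_spam, [rule names])` pair, the optional `renames` map plays the role of the `s/OLD/NEW/g` substitution, and the rule names and scores in the toy data are invented.

```python
def rescore(messages, scores, renames=None, threshold=5.0):
    """Recompute FP% and FN% for logged rule hits under a new score table.

    messages: iterable of (is_spam, [rule names]) taken from existing logs.
    scores:   dict mapping rule name to its new score; unknown rules count 0.
    renames:  optional dict mapping old rule names to replacement names.
    """
    renames = renames or {}
    ham = spam = fp = fn = 0
    for is_spam, rules in messages:
        total = sum(scores.get(renames.get(rule, rule), 0.0) for rule in rules)
        flagged = total >= threshold
        if is_spam:
            spam += 1
            fn += 0 if flagged else 1   # spam that was not flagged
        else:
            ham += 1
            fp += 1 if flagged else 0   # ham that was flagged
    return {
        "fp_pct": 100.0 * fp / ham if ham else 0.0,
        "fn_pct": 100.0 * fn / spam if spam else 0.0,
    }


# Toy data: four logged messages and an invented score table.
logged = [
    (True, ["A_SPAMMY", "HABEAS_ACCREDITED_COI"]),
    (False, ["HABEAS_ACCREDITED_COI"]),
    (True, ["A_SPAMMY"]),
    (False, ["A_SPAMMY"]),
]
new_scores = {"A_SPAMMY": 6.0, "RCVD_IN_RP_CERTIFIED": -3.0}
renames = {"HABEAS_ACCREDITED_COI": "RCVD_IN_RP_CERTIFIED"}
print(rescore(logged, new_scores, renames))
```

Running the same logged hits through the old and the new score tables gives two FP%/FN% pairs that can be compared directly, without repeating the mass-check itself.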

Comment 4 Matthias Leisi 2009-12-09 04:58:09 UTC

> http://ruleqa.spamassassin.org/20091205-r887515-n
> For many weeks, the weekly masschecks have shown a minor but consistent number
> of false positives in DNSWL medium and low.  As a matter of practice I

Are these actual false positives, or are they due to errors in trusted_networks and similar settings?

> My concerns:
> * DNSWL has no obvious and easy methods on their site to report violations
> where spam was sent from a DNSWL listed host.

There is a feedback form and an email address. 

> * Does DNSWL currently use spam traps as an automated method to detect
> whitelist violations?  I don't know.  Whatever their regular practices are, I
> strongly believe that automation is the only sustainable way for a service of
> this nature to be maintainable in the long term.

dnswl.org is mostly based on manual processes (there are regular crosschecks with blacklists, extensive use of DNS logs etc and additional tools).

(Full Disclosure: I'm the dnswl.org project leader)

Comment 5 Warren Togami 2009-12-09 07:00:22 UTC

(In reply to comment #4)
> > My concerns:
> > * DNSWL has no obvious and easy methods on their site to report violations
> > where spam was sent from a DNSWL listed host.
> 
> There is a feedback form and an email address. 

http://www.dnswl.org/request.pl
Is this the feedback form?  It really needs to be more obvious how you are expected to use this to report violations.

> 
> > * Does DNSWL currently use spam traps as an automated method to detect
> > whitelist violations?  I don't know.  Whatever their regular practices are, I
> > strongly believe that automation is the only sustainable way for a service of
> > this nature to be maintainable in the long term.
> 
> dnswl.org is mostly based on manual processes (there are regular crosschecks
> with blacklists, extensive use of DNS logs etc and additional tools).

Are the cross-checks automated?
Are people notified if their listing gets demoted?

Comment 6 Kevin A. McGrail 2009-12-09 07:05:07 UTC

> dnswl.org is mostly based on manual processes (there are regular crosschecks
> with blacklists, extensive use of DNS logs etc and additional tools).
> 
> (Full Disclosure: I'm the dnswl.org project leader)

For the record, I don't have a problem with manual processes.  I also don't necessarily believe our rules have to be based on any criteria such as email response, etc.

This is because I agree completely with Justin that SA's entire framework is designed to deal with FPs/FNs by weighted scoring.

Lowering the scores and running tests would be the appropriate action from my perspective.  Additionally, the scores suggested seem good.

Therefore, I am +1 on changing the DNSWL scores to:

score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

KAM

Comment 7 AXB 2009-12-09 07:09:04 UTC

I am +1 on changing the DNSWL scores to:

score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

Comment 8 Warren Togami 2009-12-09 07:10:40 UTC

Let me reiterate that I want the DNSWL scores to be adjusted higher in sa-update sometime later after methodologies as noted are improved.

Comment 9 Kevin A. McGrail 2009-12-09 07:20:15 UTC

(In reply to comment #8)
> Let me reiterate that I want the DNSWL scores to be adjusted higher in
> sa-update sometime later after methodologies as noted are improved.

I would expect the scores of all rules to be continuously adjusted, removed, etc. through sa-update.

However, I don't believe SA is officially requiring changes in methodologies for RBLs at the level you are suggesting.  

In short, the results of the test should stand on their own.  This may seem a bit overbroad and I'm open to suggestions but there are plenty of RBLs that don't accept comments or respond.

I would, of course, question the ones that charge money for delisting.  That should likely be an automatic killer for inclusion.

Comment 10 Warren Togami 2009-12-09 07:31:57 UTC

OK right, in general you are correct.

But DNSWL in this case is a collaborative community project.  This is only a suggestion to improve their methodologies in such a way that would make us feel more comfortable to assign a greater weight to their rules.

Matthias Leisi, I want to state my appreciation for your team's work.  Please know that we are not singling out DNSWL.  We are similarly reducing scores for the other whitelists in Bug #6247.

Comment 11 Kevin A. McGrail 2009-12-09 07:46:11 UTC

> But DNSWL in this case is a collaborative community project.  This is only a
> suggestion to improve their methodologies in such a way that would make us feel
> more comfortable to assign a greater weight to their rules.

My $0.02: even if they deliver roses and give dark chocolates or send burning dog poo to someone who sends them an email, the scoring is going to come down to the results.  Warm and fuzzies are more just for the initial "guess" for throwing the rule into the mix.

KAM

Comment 12 Warren Togami 2009-12-15 13:29:31 UTC

Current 50_scores.cf:
score RCVD_IN_DNSWL_LOW 0 -1 0 -1
score RCVD_IN_DNSWL_MED 0 -4 0 -4
score RCVD_IN_DNSWL_HI 0 -8 0 -8

# SUMMARY for threshold 5.0:
# Correctly non-spam: 703455  99.87%
# Correctly spam:     2551162  97.96%
# False positives:       911  0.13%
# False negatives:     53158  2.04%
# TCR(l=50): 26.384082  SpamRecall: 97.959%  SpamPrec: 99.964%

Replacement 50_scores.cf:
score RCVD_IN_DNSWL_LOW 0 -0.7 0 -0.7
score RCVD_IN_DNSWL_MED 0 -2.3 0 -2.3
score RCVD_IN_DNSWL_HI 0 -5 0 -5

# SUMMARY for threshold 5.0:
# Correctly non-spam: 703424  99.87%
# Correctly spam:     2551604  97.98%
# False positives:       942  0.13%
# False negatives:     52716  2.02%
# TCR(l=50): 26.091208  SpamRecall: 97.976%  SpamPrec: 99.963%

This makes for a negligible difference in the mcsnapshot results and seems clearly safe in my book.  I am committing and closing this bug.
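
The two summaries can be cross-checked from the raw counts. The Python sketch below assumes the usual total cost ratio definition, TCR = nspam / (lambda * FP + FN) with lambda = 50; that formula is not stated in this bug, but it reproduces both reported TCR values. The function name `summarize` is illustrative.

```python
def summarize(ham_ok, spam_ok, fp, fn, lam=50):
    """Derive the summary percentages and TCR from the four raw counts."""
    nham = ham_ok + fp      # all ham messages in the corpus
    nspam = spam_ok + fn    # all spam messages in the corpus
    return {
        "fp_pct": 100.0 * fp / nham,
        "fn_pct": 100.0 * fn / nspam,
        "tcr": nspam / (lam * fp + fn),
        "recall_pct": 100.0 * spam_ok / nspam,
        "prec_pct": 100.0 * spam_ok / (spam_ok + fp),
    }


old = summarize(ham_ok=703455, spam_ok=2551162, fp=911, fn=53158)
new = summarize(ham_ok=703424, spam_ok=2551604, fp=942, fn=52716)

for label, result in (("old", old), ("new", new)):
    print(label, {key: round(value, 3) for key, value in result.items()})

# Net effect of the score change on the same corpus.
print("extra false positives:", 942 - 911)
print("fewer false negatives:", 53158 - 52716)
```

On this corpus the change trades 31 additional false positives for 442 fewer false negatives; with a false positive weighted 50 times as heavily as a false negative, that accounts for the slightly lower TCR.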

Comment 13 Warren Togami 2009-12-15 13:40:11 UTC

Sending        rules/50_scores.cf
Transmitting file data .
Committed revision 891001.

Comment 14 Warren Togami 2009-12-15 13:42:43 UTC

This is a comment to satisfy Bugzilla's demand.