3998 – Add support for Habeas v2

Bug 3998 - Add support for Habeas v2

Summary: Add support for Habeas v2

Status:	RESOLVED FIXED

Alias:	None

Product:	Spamassassin
Classification:	Unclassified
Component:	Rules (Eval Tests) (show other bugs)
Version:	SVN Trunk (Latest Devel Version)
Hardware:	Other other

Importance:	P5 enhancement
Target Milestone:	3.0.4
Assignee:	SpamAssassin Developer Mailing List

URL:
Whiteboard:	can be applied if necessary
Keywords:

Depends on:
Blocks:	4182
	Show dependency tree

Reported:	2004-11-24 19:52 UTC by Theo Van Dinter
Modified:	2005-05-17 14:41 UTC (History)
CC List:	3 users (show)

Attachment	Type	Modified	Status	Actions	Submitter/CLA Status
Method and rules to check EnvelopeFrom and source IP for Accreditor	patch			None	Carl S. Gutekunst
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Theo Van Dinter 2004-11-24 19:52:25 UTC

The folks at Habeas and I have been in discussion wrt the new way that they want to do things.  We'd 
like to get the code in for 3.0.2.  Right now, I'm waiting to see their patch and for the CCLA to be 
received.

Creating the ticket as a tracker.

Comment 1 Carl S. Gutekunst 2004-12-05 21:11:27 UTC

Created attachment 2549 [details]
Method and rules to check EnvelopeFrom and source IP for Accreditor

This patch adds support for Habeas accreditation. This mechanism uses a flag in
the EnvelopeFrom that triggers a DNS whitelist query. The check of the Envelope
From is only an optimization at this point; it allows SpamAssassin to avoid
making an unnecessary DNS query. In the future this flag will be use to select
among multiple accreditation services.

The new DNS whitelist zone, accredit.habeas.com, returns a code in the final
octet that indicates the "accreditation level" for the sender. Legal values
range from 1 to 99, although for the near term Habeas will only be using the
discrete values 10, 20, 30, 40, and 50. Explanations of these are in the
comments added to the rules, below.

The patch does the following:

   * Add a new function that checks for a "reputation tag" in the
     Envelope From address of the form "a--accreditor", and
     extracts the accreditor. Note that this function is not specific
     to Habeas.

   * Add a new Evaluation Test that looks up the sender's address in a
     DNSWL if a specified "accreditor" is present in the Envelope
     From address. This function is also not specific to Habeas.

   * Added code to disable the test for the Habeas Haiku if an
     accreditor is present. This prevents folks from sending both the
     Haiku and the accreditor, and getting a bonus of -16.

   * Define three new rulesets that use the above functions to query
     the DNSWL sa-accredit.habeas.com if "a--habeas" is in the
     Envelope From. The ruleset defines the accreditor and the DNSWL to
     query.

   * Add the -firsttrusted to the existing HABEAS_USER rule. This
     should have been set all along; that was an oversight on our part.

   * Add more comments to Habeas additions in general.

Comment 2 Carl S. Gutekunst 2004-12-05 21:38:40 UTC

The new Habeas DNS whitelist zone, accredit.habeas.com, has the following names
defined that can be used for testing:

2.0.0.127.accredit.habeas.com has address 127.0.0.255
240.42.185.208.accredit.habeas.com has address 127.0.0.0
241.42.185.208.accredit.habeas.com has address 127.0.0.10
242.42.185.208.accredit.habeas.com has address 127.0.0.20
243.42.185.208.accredit.habeas.com has address 127.0.0.30
244.42.185.208.accredit.habeas.com has address 127.0.0.40
245.42.185.208.accredit.habeas.com has address 127.0.0.50
246.42.185.208.accredit.habeas.com has address 127.0.0.60

Comment 3 Theo Van Dinter 2004-12-15 22:13:33 UTC

this should be in the 3.0 milestone path.

Comment 4 Daniel Quinlan 2005-01-08 03:38:06 UTC

-1 the negative bonuses are excessive and inconsistent with our current levels

As I understand it, our current levels are:

  -16 for closed-loop opt-in
  -8 for opt-in

Implied in those scores for Habeas and Bonded Sender, are the extensive
application and verification process needed to get into these programs.
If current Habeas listings don't require any verification, then there's
already a discreprancy in the current scores.

If I read this patch correctly, Habeas listings would now often yield:

  -8.2 for opt-in with no checking
  -12 for opt-in with current level of checking
  -16 for double opt-in with current level of checking

Most of all, -16 is excessive, even for confirmed opt-in.  The highest scoring
ham in our entire corpus is only a 12.  A negative bonus of -8 should be enough
for anything, maybe -10.  More than -8 or -10 is just making it just way too
juicy to compromise a machine.

In addition, I don't get why the other two scores are increasing in magnitude
in this patch.  If there's no accredidation of the current batch of senders,
then the current scores are way too low.

Finally, we're going to start getting crazy with bonuses, maybe we should add
ones for independent third-party verification, donation of profits to
non-profits, etc.  My point: I don't think we want SpamAssassin to be front-
line for the marketing departments of these programs.

Comment 5 Daniel Quinlan 2005-01-08 03:46:03 UTC

Actually, after looking a bit more, the current levels don't really seem to be
following any convention at all.  Bonded Sender has a -4.3 which is a holdover
from a previous run of the GA which limited negative scores to -4.3.  The -8
for Habeas was hand-set at some point.

Comment 6 Carl S. Gutekunst 2005-01-08 18:23:48 UTC

Dan, the new Habeas levels are:

-8.0 For double opt-in, same as currently in 3.0.2. Note that the code checks
     for either the old Haiku or the new bounce address flag, but not both.

-4.0 For opt-in.

-0.2 For automated testing but no formal review of their processes.

If someone was both a Habeas customer and a Bonded Sender customer, they could
conceivably get to -12.3, but I don't see where they'd get to -16.

Comment 7 Daniel Quinlan 2005-01-09 21:30:26 UTC

Hmmmm... okay, then, I revoke my -1.  It might be clearer and cleaner to have
a single entry point (eval test) for all of the Habeas rules.

Can you please describe the automated testing a bit?  I just want to know that
we're not giving out bonus points for something easily forged.

Comment 8 Daniel Quinlan 2005-01-09 21:32:52 UTC

Carl,

Can you submit an individual CLA so we can accept this patch?  Note: we have
received the corporate CLA for Habeas, so we just need CLAs for any individuals
who will be contributing code.

Comment 9 Carl S. Gutekunst 2005-01-10 15:51:54 UTC

> Hmmmm... okay, then, I revoke my -1.  It might be clearer and cleaner to have
> a single entry point (eval test) for all of the Habeas rules.

Agreed, but I coded it the way I did for four reasons. First, I was trying to
make as few changes to the base code as possible. Second, I wanted to be able to
write separate rules for the old Habeas style (Haiku + HUL) and the new (return
path + ACCREDIT); the easiest way to do that without breaking the existing rules
was to make separate entry points. Third, we'd like to open up the return path
encoding to other accreditation authorities, not just Habeas. Finally, I wanted
to make it obvious where to delete the old Haiku-based code when it comes to
that, again without breaking other existing rules.

I commented the code pretty carefully so newcomers would (I hoped!) understand
what was going on. 

> Can you please describe the automated testing a bit?  I just want to know
> that we're not giving out bonus points for something easily forged.

With SpamAssassin 3.0, all Habeas senders are verified by DNSWL lookup. There's
no opportunity for forging anything, either in the envelope or header. The
difference between the different Accreditation Levels is the nature of how
senders are added or remove from our whitelist, not how we authenticate them.

Comment 10 Justin Mason 2005-01-12 19:11:49 UTC

+1

looks good to me.  I actually prefer the different entry points, as it does look
like that'd keep down the patch size ;)

Comment 11 Theo Van Dinter 2005-01-21 13:17:05 UTC

+1

looks good to me. :)

Comment 12 Theo Van Dinter 2005-01-21 13:24:02 UTC

committed to 3.1/trunk, btw.

Comment 13 Daryl C. W. O'Shea 2005-03-28 13:35:31 UTC

Does anyone else think that this should be moved into a plugin?

Comment 14 Michael Parker 2005-03-28 14:08:21 UTC

Subject: Re:  Add support for Habeas v2

> Does anyone else think that this should be moved into a plugin?

+1

Michael

Comment 15 Theo Van Dinter 2005-03-28 14:16:49 UTC

Subject: Re:  Add support for Habeas v2

On Mon, Mar 28, 2005 at 02:08:21PM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> > Does anyone else think that this should be moved into a plugin?
> +1

For 3.0 I'm -0.5, for 3.1 I'm 0.0.

I'd rather not worry about moving code around for 3.0, but for 3.1 I
have no strong opinion either way.

Comment 16 Michael Parker 2005-03-28 14:19:36 UTC

Subject: Re:  Add support for Habeas v2

> 
> For 3.0 I'm -0.5, for 3.1 I'm 0.0.
> 
> I'd rather not worry about moving code around for 3.0, but for 3.1 I
> have no strong opinion either way.
> 

I should clarify, my +1 is for 3.1, no extra work for 3.0.

Michael

Comment 17 Daryl C. W. O'Shea 2005-03-28 14:22:49 UTC

Subject: Re:  Add support for Habeas v2

> I should clarify, my +1 is for 3.1, no extra work for 3.0.
> 
> Michael

Same here.  Let's just get it in 3.0 as is.

Daryl

Comment 18 Carl S. Gutekunst 2005-03-28 14:35:02 UTC

This is a free DNSWL check that could be implemented just as a new rule, with no
code at all. The reason I made code changes here (and in the companion patch,
Bug# 4182) is to improve SpamAssassin performance: the DNSWL check is done only
if the message contains a sentinal requesting the whitelist lookup.

Note too that, although only Habeas is using this mechanism at this time, the
mechanism is completely open for anyone to use (unlike the old Habeas Haiku,
which was restricted).

So I'd like to request it (and Bug# 4182) remain in EvalTests.pm.

<csg>

Comment 19 Daryl C. W. O'Shea 2005-03-28 15:17:09 UTC

Subject: Re:  Add support for Habeas v2

I don't really see how a DNSWL is much different from a DNSBL which we 
currently implement via a plugin (URIDNSBL).

A Habeas, or Accreditor, plugin would still be included, and enabled, 
with the distribution just like URIDNSBL, SPF, SpamCop, etc.


Daryl

Comment 20 Carl S. Gutekunst 2005-03-28 16:09:45 UTC

> I don't really see how a DNSWL is much different from a DNSBL which we 
> currently implement via a plugin (URIDNSBL).

The URIDNSBL plugin contains its own logic for scanning the body and a
dispatcher for multiple concurrent DNS requests. It's quite a different animal
from a wrapper that examines two header fields and then calls
check_rbl_backend(). For that matter, is it even possible for a plugin to call
check_rbl_backend()?

> A Habeas, or Accreditor, plugin would still be included, and enabled, 
> with the distribution just like URIDNSBL, SPF, SpamCop, etc.

SPF is a plugin because it includes a heavy library and needs to issue its own
(possibly unbounded) set of resolver requests. The SpamCop plugin is for
*sending* complaint mail; checking the SpamCop DNSBL is a simple rule that
invokes check_rbl() in EvalTests.pm.

I think the Habeas Accreditor code is more comparable to checking Sender Base,
which is in Dns.pm.

There's another issue, which I should have pointed out in my first reply: In the
presence of an accreditor, I need to disable the old Habeas "haiku" code. If I
didn't, senders who are in transition between the "haiku" and the Accreditor
style would get a -16 bonus. I don't think that's possible from a plugin.

Comment 21 Daryl C. W. O'Shea 2005-03-28 21:56:05 UTC

Subject: Re:  Add support for Habeas v2

>>I don't really see how a DNSWL is much different from a DNSBL which we 
>>currently implement via a plugin (URIDNSBL).
> 
> The URIDNSBL plugin contains its own logic for scanning the body and a
> dispatcher for multiple concurrent DNS requests. It's quite a different animal
> from a wrapper that examines two header fields and then calls
> check_rbl_backend(). For that matter, is it even possible for a plugin to call
> check_rbl_backend()?

Without looking at the code, I assume there wouldn't be a problem 
calling it.  I don't believe it's a public API, but the plugin would be 
part of the distribution so we don't have to worry about it breaking.


>>A Habeas, or Accreditor, plugin would still be included, and enabled, 
>>with the distribution just like URIDNSBL, SPF, SpamCop, etc.
> 
> SPF is a plugin because it includes a heavy library and needs to issue its own
> (possibly unbounded) set of resolver requests. The SpamCop plugin is for
> *sending* complaint mail; checking the SpamCop DNSBL is a simple rule that
> invokes check_rbl() in EvalTests.pm.

Yeah, they weren't great examples -- implementation wise.  I'm actually 
thinking that any rules/evals that depend on third parties should be in 
plugins.  I know there was talk about thinning out EvalTests.pm...


> I think the Habeas Accreditor code is more comparable to checking Sender Base,
> which is in Dns.pm.

Well, Dns.pm needs to be worked on too (either Net::DNS needs to support 
single socket background queries or we need to implement it ourselves -- 
I've got some code to do it but haven't tested it across different 
platforms).

I'm thinking that Sender Base should also be in some sort of Accreditor 
plugin too.  Maybe I'm alone on that.


> There's another issue, which I should have pointed out in my first reply: In the
> presence of an accreditor, I need to disable the old Habeas "haiku" code. If I
> didn't, senders who are in transition between the "haiku" and the Accreditor
> style would get a -16 bonus. I don't think that's possible from a plugin.

If the 'new Habeas' code was going to be moved to a plugin I'd think 
that the 'old Habeas' code would be moved to the same plugin, so that 
wouldn't be an issue.

Of course, even if the code was kept separate, we could always use a 
meta rule for the old code (with a !NEW_HABEAS conditional) and take the 
penalty of needlessly running the test.  I wouldn't recommend it, but 
it's one way.


Daryl

Comment 22 Justin Mason 2005-04-09 17:51:58 UTC

actually, there's another problem with pluginizing this code; currently, the
internal APIs for check_rbl() and check_rbl_backend() are not well enough
defined and documented for plugin use, iirc.  consider "check_rbl_backend" as a
method name -- even that needs a better, more descriptive public API name, we
shouldn't even be calling them "RBLs"! ;)  fixing this would be advisable before
pluginizing code that calls it.

other than that, I'd be +1 to pluginizing for 3.1.0 if anyone wants to do that
stuff.   Carl, it's not a change in "supportedness" of Habeas in any way -- it's
purely a code cleanliness issue, and would leave the habeas code and lookups
part of the core (just now part of the core set of default plugins).

Comment 23 Theo Van Dinter 2005-04-26 14:18:33 UTC

ok, since there was apparently already a vote and everything (the discussion
went off in a plugin tangent which is a different issue,) applied.  r164885

Comment 24 Michael Parker 2005-04-26 14:35:19 UTC

It's not clear that there were 3 +1s on this code, unless we count Daryl's
"Let's just get it in 3.0 as is." as a vote.

As mentioned, I'm -1 on any new enhancements in the 3.0 tree, especially with
3.1 seemingly around the corner.

Comment 25 Theo Van Dinter 2005-04-26 14:53:46 UTC

On Tue, Apr 26, 2005 at 02:35:20PM -0700,
bugzilla-daemon@bugzilla.spamassassin.org wrote:
> It's not clear that there were 3 +1s on this code, unless we count Daryl's
> "Let's just get it in 3.0 as is." as a vote.

Did I miscount?  I thought I saw 3 +1's in there, but I may have misread one 
of the plugin votes as a patch vote.  Hrm.  Ok, you're right, there seem only
to be 2 specifically for the patch.  <sigh>

Daryl can you clarify your statement in comment 17?

I'll revert the commit for now.

> As mentioned, I'm -1 on any new enhancements in the 3.0 tree, especially with
> 3.1 seemingly around the corner.

I keep hearing that, but "around the corner" is a minimum of 1 month from when
we actually do decide to do the release.  Also, this patch was supposed to be up
for 3.0.2 but due to various issues wasn't included.

Comment 26 Daryl C. W. O'Shea 2005-04-26 18:45:23 UTC

Subject: Re:  Add support for Habeas v2

Well, I had only meant that I didn't want the code re-written for 3.0. 
I wasn't voting for 3.0 in either way.

I really don't know what to say about the whole time line... maybe if 
it's left out of 3.0 we can get a 3.1 RC out real soon?


On a related note, I'd like to see 3.1 bugs up for review reviewed soon 
so that we don't run in to this again in (hopefully) a few weeks.

Comment 27 Michael Parker 2005-04-27 13:58:41 UTC

Move to 3.0.4 milestone.

Comment 28 Theo Van Dinter 2005-05-17 22:41:45 UTC

ok, I think we've come to the general conclusion that this code will have to wait for 3.1.  so closing the 
ticket.