SA Bugzilla – Full Text Bug Listing

| Summary: | internal_networks support | | |
|---|---|---|---|
| Product: | Spamassassin | Reporter: | Michel Bouissou <michel> |
| Component: | Rules | Assignee: | SpamAssassin Developer Mailing List <dev> |
| Status: | RESOLVED FIXED | | |
| Severity: | normal | | |
| Priority: | P1 | | |
| Version: | 2.60 | | |
| Target Milestone: | 2.70 | | |
| Hardware: | Other | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Attachments: | internal_networks support; Test message causing DYNABLOCK FP; Test output with debug; bugfix on top of patch 1563 | | |
Description
Michel Bouissou
2003-11-09 11:58:52 UTC
Agh. You're right. I think the solution is to modify the -lastuntrusted lookup behaviour to include trusted relays; so if a mail is relayed through *any* machine -- whether trusted or untrusted -- that's OK and it won't hit dynablock. (currently, it only checks for a relay through an untrusted server, hence this bug.) It's perfectly reasonable to trust the other ISP's relays, so just having those in the trusted list should not cause Dynablock FPs.

changing milestone -- easy to fix if we all agree

Justin Mason wrote:
> I think the solution is to modify the -lastuntrusted lookup behaviour
> to include trusted relays; so if a mail is relayed through *any* machine
> -- whether trusted or untrusted -- that's OK and it won't hit dynablock.
Doing this would miss dynablock checks for all direct-to-MX spam sent thru
one's secondary MX (as the secondary MX will relay to the primary, the dialup
that sent direct-to-MX spam to the secondary MX wouldn't be
dynablock-checked). This would be bad because a *lot* of direct-to-MX spam is
sent thru secondary MXes (maybe spammers think secondary MXes are less
protected ?).
But the "trusted_networks" system is very useful, as one often receives most of
one's legitimate mail from the 4-5 biggest ISPs in one's country. So "trusting"
them saves a lot of useless DNSBL queries, and spares the DNSBL servers a lot
of needless load as well.
So I believe we need a supplementary option "my_mxes" separate from the
trusted_networks. Machines on trusted_networks wouldn't be DNSBL checked, but
only the machines that relay to one of "my_mxes" AND are not "trusted" would
then be dynablock-checked.
If "my_mxes" is not set, machines that relay to the last SMTP server should be
checked (if not in the "trusted" list).
Comments ?
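To make the proposal above concrete, here is a minimal Python sketch of the selection rule as I read it (SpamAssassin itself is Perl, and this is not its code). The function names, the (sender, receiver) hop representation and the example addresses are all hypothetical; skipping our own MXes as senders is an extra assumption that the comment does not spell out.

```python
from ipaddress import ip_address, ip_network


def in_any(ip, networks):
    """Return True if ip falls inside any of the CIDR strings in networks."""
    addr = ip_address(ip)
    return any(addr in ip_network(net) for net in networks)


def relays_to_check(hops, my_mxes, trusted_networks):
    """Pick the sending IPs that would be looked up in a dialup DNSBL.

    hops is a list of (sender_ip, receiver_ip) pairs, most recent hop first,
    so hops[0] is the delivery to the last SMTP server.
    """
    selected = []
    for index, (sender_ip, receiver_ip) in enumerate(hops):
        if my_mxes:
            at_edge = in_any(receiver_ip, my_mxes)
        else:
            # Fallback from the proposal: only the last SMTP server counts.
            at_edge = index == 0
        # Assumption: our own MXes are never looked up themselves.
        own_mx = in_any(sender_ip, my_mxes)
        if at_edge and not own_mx and not in_any(sender_ip, trusted_networks):
            selected.append(sender_ip)
    return selected


MY_MXES = ["192.0.2.10/32", "192.0.2.20/32"]   # primary and secondary MX
TRUSTED = ["198.51.100.0/24"]                   # a big ISP's relays

# Dialup -> secondary MX -> primary MX: the dialup is still selected.
via_secondary = [("192.0.2.20", "192.0.2.10"), ("203.0.113.77", "192.0.2.20")]
print(relays_to_check(via_secondary, MY_MXES, TRUSTED))

# Dialup -> trusted ISP relay -> primary MX: nothing is selected.
via_isp = [("198.51.100.5", "192.0.2.10"), ("203.0.113.77", "198.51.100.5")]
print(relays_to_check(via_isp, MY_MXES, TRUSTED))
```

The two example chains correspond to the cases discussed here: direct-to-MX spam through a secondary MX keeps its dynablock lookup, while a customer of a trusted ISP relaying through that ISP does not trigger one.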
This sounds complicated... perhaps should wait until 2.62?

I think Michel is right -- a supplementary version of 'trusted_networks' specifically for networks you *run*. ie.:

trusted_networks = networks you trust will not send spam, you trust they will report the correct rDNS data
my_mxes = the "external edge" of your network, where it meets the net

trusted_networks is "milder" than my_mxes; but my_mxes is required to support Dynablock tests.

BTW I think I prefer "internal_networks" as a better name than "my_mxes". ;)

Created attachment 1563 [details]
internal_networks support
OK, here's the patch to add 'internal_networks' as discussed.
it's a bit of a big change, unfortunately:
 Conf.pm      |   43 ++++++++++++++++++++++++--
 EvalTests.pm |   97 +++++++++++++++++++++++++++++++++++++++--------------------
 Received.pm  |   58 ++++++++++++++++++++++++++++++++---
 3 files changed, 159 insertions(+), 39 deletions(-)
Here's some log of me explaining it to Dan:
(16:15:30) justinmason23: one thing though, is the Dynablock/trusted_networks
problem
(16:16:13) justinmason23: I think the only way to fix that properly is to add
an 'internal_networks' analog to 'trusted_networks', make it very conservative
(no inference), and use that for Dynablock checks
(16:17:28) justinmason23: because the trusted-network stuff combined with
Dynablock is causing big FPs when it makes the wrong guesses
(16:17:28) Dan: so, dialup checks would only activate if internal_networks is
set?
(16:17:48) justinmason23: no, it'd use dialup checks -- but it'd never consider
any host internal past the host the scanner is running on
(16:18:22) justinmason23: so users who have a network where SA is not running
on an external MX, will miss the dynablock hits
(16:18:30) justinmason23: unless they set 'internal_networks' correctly
(16:19:39) Dan: how is that different than trusted_networks ?
(16:19:48) justinmason23: trusted_networks does inference
(16:20:01) justinmason23: also it's used to trim out hosts from the list of DNS
checks
(16:20:18) justinmason23: people are setting it to include their friends' ISPs
as well, which is throwing off Dynablock
(16:21:01) justinmason23: basically, TN is being used for meaning *both* "my
networks" and "hosts I trust not to forge mail"
(16:21:12) justinmason23: this double meaning is the problem
(16:22:07) Dan: ah, extending too far
(16:22:19) justinmason23: yeah
(16:22:26) justinmason23: the use of it for Dynablock checks is what causes the
problem
(16:22:30) Dan: I misunderstood because I was thinking of the OTHER dialup
problem.
(16:22:39) Dan: you also have to address internal dialup hosts.
(16:23:02) justinmason23: there's an OK workaround for that though
(16:23:15) justinmason23: either (a) the host doesn't scan mail from their own
dialup networks
(16:23:28) Dan: I think the solution is to only do dynamic/dialup based on
internal networks... if unset, do nada... no inference
(16:24:15) justinmason23: or (b) the admin configures the dialup networks as
being part of internal_networks and SA handles not running the dynablock check
(16:24:28) justinmason23: yep, that's what I'm suggesting
(16:24:48) Dan: no inference and don't even run them if internal_networks is
unset
(16:25:11) justinmason23: hmm
(16:25:56) justinmason23: Well, if we did that then most people would lose the
dynablock hits, even though it's only a very small number of people who are
running into the "dynablock hitting our own dialup pool" problem
(16:26:35) Dan: I think the score is so high we have to do it.
(16:27:19) justinmason23: I think we could use "just this host" for the default
internal_networks setting for most people, then just the people who *have*
their own dialup pools have to set it for that problem case
(16:28:02) Dan: that would be ok, I guess
(16:28:58) justinmason23: that would mean that most people wouldn't lose their
hit rate
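The distinction drawn in the log above (trusted_networks for "hosts I trust not to forge mail", internal_networks for "my networks", defaulting to just the scanning host) can be sketched roughly as follows. This is an illustrative Python sketch, not the patch itself; `effective_internal`, `relay_flags` and the addresses are hypothetical names and values, and the /32 default assumes IPv4.

```python
from ipaddress import ip_address, ip_network


def effective_internal(internal_networks, scanner_ip):
    """Assumed default: with nothing configured, only the scanner host is internal."""
    if internal_networks:
        return list(internal_networks)
    return [f"{scanner_ip}/32"]  # IPv4 only in this sketch


def relay_flags(ip, internal_networks, trusted_networks, scanner_ip):
    """Return the two independent flags for one relay, with no inference."""
    addr = ip_address(ip)
    internal = any(
        addr in ip_network(net)
        for net in effective_internal(internal_networks, scanner_ip)
    )
    trusted = any(addr in ip_network(net) for net in trusted_networks)
    return {"trusted": trusted, "intl": int(internal)}


# A friend's ISP relay: trusted, but never internal.
print(relay_flags("198.51.100.5", [], ["198.51.100.0/24"], "192.0.2.10"))

# Our own secondary MX once internal_networks is set by the admin.
print(relay_flags("192.0.2.20", ["192.0.2.0/24"],
                  ["192.0.2.0/24", "198.51.100.0/24"], "192.0.2.10"))
```

Keeping the two flags separate is the point of the design: adding a friend's ISP to the trusted list no longer moves the edge at which dialup lookups are made.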
-0 It's a pretty big change in a point release, and would come close to justifying a GA run (I agree that would be unnecessary, but not by all that much). If it's not buggy, go for it, but at this point, it's not really tested, and it's a configuration file change in a point release.

I'll test this one and let you know.

    [root@totor 5.6.1]# patch -p1 < /home/michel/A_Installer/SpamAssassin/2.60_patches/trusted_networks.patch
    patching file Mail/SpamAssassin/Conf.pm
    Hunk #1 succeeded at 223 (offset -2 lines).
    Hunk #2 succeeded at 468 (offset -2 lines).
    Hunk #3 succeeded at 1352 (offset -3 lines).
    Hunk #4 succeeded at 1396 (offset -3 lines).
    patching file Mail/SpamAssassin/EvalTests.pm
    Hunk #1 succeeded at 696 (offset -4 lines).
    Hunk #2 succeeded at 919 (offset -74 lines).
    Hunk #3 succeeded at 1149 (offset -74 lines).
    Hunk #4 succeeded at 1189 (offset -74 lines).
    Hunk #5 succeeded at 1212 (offset -74 lines).
    Hunk #6 succeeded at 1299 (offset -74 lines).
    patching file Mail/SpamAssassin/Received.pm
    Hunk #2 succeeded at 84 (offset -13 lines).
    Hunk #3 succeeded at 221 (offset -29 lines).
    Hunk #4 succeeded at 521 (offset -87 lines).
    Hunk #5 succeeded at 919 (offset -100 lines).

Should my own MXes be in both internal_networks and trusted_networks, or only in trusted_networks ?

they should be in both -- trusted_networks should contain internal_networks. the idea is that internal_networks is *just* your nets and MXes; trusted can contain other people's mailservers and networks (if you trust them not to originate spam or be running a subverted machine).

Yes, but I believe that one usually "trusts" his "internal_networks". So putting your own machines both in "trusted_networks" and "internal_networks" is all right, but maybe a bit of unnecessary hassle. Putting your own machines only in "internal_networks" should be enough, and the system should determine that "internal" machines are automatically "trusted" as well. (The tests that currently check for "trusted" should check for "trusted or internal", that would do the trick.) It would just make configuration easier...

The patch doesn't seem to work. I just got another DYNABLOCK FP:

    X-Spam-RBL-Results: <dns:32.90.8.80.dynablock.easynet.nl?type=TXT> ["Dynamic/Residential IP range listed by easynet.nl DynaBlock - http://dynablock.easynet.nl/errors.html"]
    X-Spam-Trusted-Relays: [ ip=80.67.174.41 rdns=samizdat.net helo=slut.samizdat.net by=totor.bouissou.net ident= intl=0 ] [ ip=193.252.22.28 rdns=smtp3.wanadoo.fr helo=mwinf0304.wanadoo.fr by=slut.samizdat.net ident= intl=0 ]
    X-Spam-DCC: Etherboy: totor.bouissou.net 1002; Body=1 Fuz1=1 Fuz2=1
    X-Spam-Status: No, hits=-103.4 required=5.0 tests=AWL,BAYES_00, RCVD_IN_DYNABLOCK,USER_IN_WHITELIST autolearn=no version=2.60
    X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on totor.bouissou.net
    X-Spam-Possible-Lang: fr
    X-Spam-Level:
    X-Spam-Untrusted-Relays: [ ip=80.8.90.32 rdns=ca-bordeaux-18-32.w80-8.abo.wanadoo.fr helo=something by=mwinf0304.wanadoo.fr ident= intl=0 ]

Another DYNABLOCK FP, and another issue as well: It seems that the last relay that sent the mail to my MX is ignored by SA, as it doesn't appear at all in SA's relays list. Possibly because this machine gave IDENT information which, as formatted by the receiving qmail, kept SA from understanding the IP format...?
Let's see: In local.cf, I have:

    trusted_networks 213.228.0/24

The e-mail Received: headers are:

    Received: from postfix3-2.free.fr (foobar@213.228.0.169) by totor.bouissou.net with SMTP; 14 Nov 2003 08:05:50 -0000
    Received: from asterix.laurier.org (lns-p19-8-82-65-66-244.adsl.proxad.net [82.65.66.244]) by postfix3-2.free.fr (Postfix) with ESMTP id 7BACDC372 for <michel@bouissou.net>; Fri, 14 Nov 2003 09:05:49 +0100 (CET)

SA diags are:

    X-Spam-RBL-Results: <dns:244.66.65.82.dynablock.easynet.nl?type=TXT> ["Dynamic/Residential IP range listed by easynet.nl DynaBlock - http://dynablock.easynet.nl/errors.html"]
    X-Spam-Trusted-Relays:
    X-Spam-DCC: Etherboy: totor.bouissou.net 1002; Body=1 Fuz1=1 Fuz2=1
    X-Spam-Status: No, hits=-103.2 required=5.0 tests=AWL,BAYES_00, RCVD_IN_DYNABLOCK,USER_IN_WHITELIST autolearn=no version=2.60
    X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on totor.bouissou.net
    X-Spam-Possible-Lang: fr
    X-Spam-Level:
    X-Spam-Untrusted-Relays: [ ip=82.65.66.244 rdns=lns-p19-8-82-65-66-244.adsl.proxad.net helo=asterix.laurier.org by=postfix3-2.free.fr ident= intl=0 ]

As you can see, the "postfix3-2.free.fr (foobar@213.228.0.169)" relay, which should appear in the "trusted relays" section, doesn't show up. SA has missed it. (I had already seen this "missed relay" issue before applying today's patch, so it's not the patch that caused this.)

Please note that the "Received: from postfix3-2.free.fr (foobar@213.228.0.169)" format is the usual Received: line format used by qmail. "foobar@" appears when the remote SMTP server answered an "IDENT" request, giving "foobar" as username.

qmail Received format is:

    Received: from (<remote_machine_rDNS>|unknown) \((<IDENT_username@>)?<IP_address>\) by <receiving_hostname> with <protocol>; <timestamp>

Sorry, I was incomplete. The qmail Received format is:

    Received: from (<remote_machine_rDNS>|unknown) (\(HELO <helo_name_given_by_remote_if_different_from_rDNS>\))? \((<IDENT_username@>)?<IP_address>\) by <receiving_hostname> with <protocol>; <timestamp>

This can make for example:

    Received: from unknown (HELO feux01a-isp) (213.199.4.210) by totor.bouissou.net with SMTP; 1 Nov 2003 07:05:19 -0000

or

    Received: from x1-6-00-04-bd-d2-e0-a3.k317.webspeed.dk (benelli@80.167.158.170) by totor.bouissou.net with SMTP; 5 Nov 2003 23:18:42 -0000

or

    Received: from adsl-207-213-27-129.dsl.lsan03.pacbell.net (HELO merlin.net.au) (Owner50@207.213.27.129) by totor.bouissou.net with SMTP; 10 Nov 2003 06:30:34 -0000

-0 I have to agree with duncf, it's too much for a point release.

Michel -- that Received header issue is best dealt with under a separate bug -- here it is: http://bugzilla.spamassassin.org/show_bug.cgi?id=2759

Folks, I think I agree the internal_networks thing is best left for 2.70. Although I am worried about sites that'll see FPs heavily from Dynablock without it. Maybe we should *ignore* trusted_networks when trying to use dynablock; always use the entire Received chain for that rule.

Justin Mason wrote:
> Folks, I think I agree the internal_networks thing is best left for 2.70.
> Although I am worried about sites that'll see FPs heavily from Dynablock
> without it.
Too bad, the patch for internal_networks being already written...
It already correctly identifies internal servers (I see intl=0 or intl=1
display correctly in mail headers).
There probably remains a little bug in the "which machine to DYNABLOCK test"
routine, but that should be easier to fix than taking another direction now ?
Anyway, I can't use the "trusted_networks" system as long as this is not
fixed, as I get 100% DYNABLOCK FPs for all the customers of big ISPs that I
would "trust"...
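As a side note on the qmail Received: layout Michel described earlier in this bug (the parsing problem itself was moved to bug 2759), a parser for that layout could look roughly like the Python sketch below. The regular expression is my own transcription of his description, not SpamAssassin's actual Received.pm pattern, and the group names and the `parse_qmail_received` helper are hypothetical. It only handles dotted-quad IPv4 addresses.

```python
import re

# Transcription of the described layout:
#   from (<rDNS>|unknown) [(HELO <helo>)] ([<ident>@]<IP>) by <host> with <proto>; <timestamp>
QMAIL_RECEIVED = re.compile(
    r"^Received: from (?P<rdns>\S+)"
    r"(?: \(HELO (?P<helo>[^)]+)\))?"
    r" \((?:(?P<ident>[^@()\s]+)@)?(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\)"
    r" by (?P<by>\S+) with (?P<proto>\S+);\s*(?P<timestamp>.+)$"
)


def parse_qmail_received(header):
    """Return the fields of a qmail-style Received: header, or None."""
    # Collapse folded header lines and repeated whitespace first.
    match = QMAIL_RECEIVED.match(" ".join(header.split()))
    if match is None:
        return None
    fields = match.groupdict()
    if fields["rdns"] == "unknown":
        fields["rdns"] = None  # qmail writes "unknown" when rDNS failed
    return fields


sample = ("Received: from postfix3-2.free.fr (foobar@213.228.0.169) "
          "by totor.bouissou.net with SMTP; 14 Nov 2003 08:05:50 -0000")
print(parse_qmail_received(sample))
```

With the IDENT username made optional in this way, the relay from the sample header is recognised with ip 213.228.0.169 and ident "foobar" instead of being dropped.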
I have taken a look at Justin's patch, trying to understand why I still get FPs on DYNABLOCK checks, even with the new "internal_networks" system. I must admit I didn't completely understand the current underlying logic, as it seems rather complex, and I'm not very familiar with Perl.

"dialup RBLs" tests are useful only for catching direct-to-MX spam, and what we can catch with this is:

- Spam sent by a dialup/dynamic IP address which is OUT of our own network or ISP's dynamic IP pool, directly to our own network's or ISP's MX.

I believe that the logic for testing "dialup RBLs" should be the following. We should check an IP against a "dialup RBL" only if:

- The "Received: ... by" (receiving machine) is either in our "internal_networks", OR is the last "Received:" (in case we haven't set internal_networks, or have forgotten one of our MXes in it, we can always be sure that the last Received: is by one of our "own legitimate" MXes, which is a safe inference)
  => Because these checks make sense only at the edge of our network.
- The "Received: from" (sending machine) is:
  - external (not in internal_networks)
  - AND untrusted (not in trusted_networks)
  - AND its rDNS isn't in the same domain as the "Received: ... by"
  - AND its IP is not in the same /16 as the "Received: ... by"

(These last 2 tests are for people who will set their own ISP's MXes as "internal". They probably won't know / won't set the complete dialup pools of their own ISP as "internal" or "trusted", but we wouldn't want legitimate mail sent by the same ISP's customer thru its MX to trigger a DYNABLOCK test. In such a case, customers of ISPs that don't properly define rDNS for their own dynamic IP pools may trigger DYNABLOCK tests, but we can't do much for ISPs whose network is rotten. It's their own problem to define rDNS properly.)

Did I miss something in there ?

no, that's pretty much correct. It's complicated a little because the logic is split between 2 functions:

1. Received.pm, the header parser and trust-determining part
2. EvalTests.pm, which picks what IPs to check from the "untrusted" set

To deal with the 'IP is not in the same /16 as the "Received: ... by"' test, the latter would have to be extended to check the relay's IP against the IP of the next relay in the list. (that's "next" in terms of time, not list order.) I don't think checking the rDNS is worthwhile though as that is spoofable.

no longer in [review] state

So could you find out why your proposed patch for internal_networks doesn't work as expected ? Do you think it would be hard to fix given the current status of this patch ? I'd love to get it working, even if the patch isn't part of the official distribution for now...

Michel, could you attach a message that FPs with the internal_networks patch, and a debug log from "spamassassin -D -t < message"?

Yes, I will attach:

- A test message (removed real contents from it, but it's enough for FP'ing).
- Test output.

I made the test with

    ./test-received < test-rcvd6.txt > test-output.txt

Where test-received is:

    #! /bin/bash
    cat |
      spamassassin -D received=255 2>&1 |
      egrep -v "^(debug: (tokenize:|bayes|(Current |Final )?PATH|[Rr]unning in|using \"|Score set)| Razor-Log:|nov [0-9][0-9] [0-9][0-9])" |
      egrep -v "(Razor2|helper-app|dccproc|[Ll]anguage)"

(just for keeping the output short and only get what we need).
In this example, my local.cf says "trusted_networks 213.228.0/24"

So this is correct:

    X-Spam-Trusted-Relays: [ ip=213.228.0.169 rdns=postfix3-2.free.fr helo= by=totor.bouissou.net ident=foobar intl=0 ]

And this is correct as well:

    X-Spam-Untrusted-Relays: [ ip=82.65.66.244 rdns=lns-p19-8-82-65-66-244.adsl.proxad.net helo=asterix.laurier.org by=postfix3-2.free.fr ident= intl=0 ]

But what is not correct is that 82.65.66.244 got DYNABLOCK-checked although it was relaying to 213.228.0.169 (which is not internal), resulting in:

    X-Spam-RBL-Results: <dns:244.66.65.82.dynablock.easynet.nl?type=TXT> ["Dynamic/Residential IP range listed by easynet.nl DynaBlock - http://dynablock.easynet.nl/errors.html"]
    X-Spam-Status: No, hits=-2.3 required=5.0 tests=BAYES_00,RCVD_IN_DYNABLOCK autolearn=no version=2.60

Created attachment 1578 [details]
Test message causing DYNABLOCK FP
Created attachment 1579 [details]
Test output with debug
This gives both debug output, and contents of the message once processed
Created attachment 1580 [details]
bugfix on top of patch 1563
ok, Michel, please try this one.
BTW that applies on top of patch 1563 -- it's cumulative.

Looks better at first sight. At least it no longer gives a DYNABLOCK FP on the test message I sent you this morning. I'll keep it running like this, and check coherence as I receive new mail. Thanks :-)

I've tested the "corrected patch" on a variety of received mail, sent directly to my primary MX or thru my secondary, with messages that must hit "DYNABLOCK" and others that must not. It seems that everything now works just fine in the internal_networks / trusted_networks system. So I think this patch should be integrated into SA.

+1 :-)

Subject: Re: [SAdev] trusted_networks don't behave as expected
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
>So I think this patch should be integrated into SA.
Excellent news!
All that remains now is determining if the devs want to wait for
2.70, since it is adding a new conf item. However, I think it
should go into 2.61, since:
1. it fixes a widespread, FP-causing issue with DYNABLOCK. This
is causing a *lot* of FPs.
2. it is backwards compatible; users who haven't set trusted_networks or
have set that but not set internal_networks will get sane defaults.
3. we've added conf items to point releases in the past, specifically
the bayes expiry ones for 2.5x.
- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Exmh CVS
iD8DBQE/vlqcQTcbUG5Y7woRAmigAJ9mImwJlOTc2Th4L+5+9kHkAOKzzgCguSJO
lb0JVHDH1rkby7lpakOgYlA=
=wKMY
-----END PGP SIGNATURE-----
Justin Mason wrote:
> However, I think it should go into 2.61, since:
>
> 1. it fixes a widespread, FP-causing issue with DYNABLOCK. This
> is causing a *lot* of FPs.

...and DYNABLOCK scores quite high...

> 2. it is backwards compatible; users who haven't set trusted_networks or
> have set that but not set internal_networks will get sane defaults.
>
> 3. we've added conf items to point releases in the past [...]

I completely share your opinion about this. This fix fixes a serious bug and can cause no harm, so it should be integrated asap.

There's an issue however which I couldn't test and for which I don't know how the system will behave: People that might set their own ISP's SMTP servers as "internal" and "trusted" (people getting mail from their ISP using fetchmail for example).

In this case there might possibly remain a risk that users of the same ISP sending mail from the ISP's dialup pool directly to the ISP's MX could still get DYNABLOCK FPs.

I'm not sure about this, but it could be, if there is no comparison done between the domain of the sender machine (from rDNS) and the domain of the receiving MX ?

I don't really know how to fix this, besides:
- Not putting your own ISP's server in the "internal networks"
- Or adding their complete dialup pools as "internal" as well ?
- Or adding domain comparison between sender machine and receiving MX ?

Cheers.

Subject: Re: [SAdev] trusted_networks don't behave as expected
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
>There's an issue however which I couldn't test and for which I don't know how
>the system will behave: People that might set their own ISP's SMTP servers as
>"internal" and "trusted" (people getting mail from their ISP using fetchmail
>for example).
>
>In this case there might possibly remain a risk that users of the same ISP
>sending mail from the ISP's dialup pool directly to the ISP's MX, could still
>get DYNABLOCK FP's.
>
>I'm not sure about this, but it could be, if there is no comparison done
>between the domain of the sender machine (from rDNS) and the domain of the
>receiving MX ?
>
>I don't really know how to fix this, besides:
>- Not putting your own ISP's server in the "internal networks"
>- Or adding their complete dialup pools as "internal" as well ?
>- Or adding domain comparison between sender machine and receiving MX ?
I think the domain comparison will probably fail in a lot of cases
anyway; consider ISPs that have a different domain for their dialup
ranges and their mailservers. (some do, e.g. earthlink.net.)
IMO the best answer here is to add the complete dialup pools as internal.
- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Exmh CVS
iD8DBQE/vmYEQTcbUG5Y7woRAtvIAJ9NBXw2dIihwJ6ik3JNH/53xtd0hgCgiN7e
MJ0tOOkwq3aFfabWzeLZ1Ak=
=urYw
-----END PGP SIGNATURE-----
Justin Mason wrote:
> I think the domain comparison will probably fail in a lot of cases
> anyway;

Sure. But it might help avoid a significant number of FPs for people who would configure their own ISP's MXes as "internal" without thinking of configuring their ISP's dialup pools as internal as well...

> consider ISPs that have a different domain for their dialup
> ranges and their mailservers. (some do, e.g. earthlink.net.)

<kidding> Well, I receive tons of spam from their netblocks, I don't mind if they trigger any possible test ;-) </kidding>

> IMO the best answer here is to add the complete dialup pools as internal.

Sure, but I don't know many people who know the comprehensive list of their ISP's dialup pools (at least, I don't know mine and don't care much ;-)

Maybe specifying "internal_networks" as IP ranges _or_ domain names could help. Of course, rDNS can sometimes be forged, but I don't believe many spammers make the effort of faking their rDNS record before sending spam to each and every domain they spam...

Anyway, the current patch already gives a big improvement; I was just thinking of possible ways to improve it.

Cheers.

ok, I think the other devs don't want to see this happening in the 2.60 series, so I'll punt it to 2.70. However, people who run into the issue (ISPs, that is), can always apply the patches here in the meantime to their 2.6x installations.

Too bad, since it works perfectly, fixes an FP-causing bug, and wouldn't harm any user upgrading, even one who pays no attention to the new feature and doesn't configure internal_networks.

I don't really understand the reason for postponing the release of an existing fix, even if this fix adds a new "feature" that cannot harm.

But I've seen SA 2.61 is already released, so I guess it's a bit late for discussing this issue. I hope to see it in 2.62 rather than have to wait for 2.70 anyway.

Cheers.
Subject: Re: [SAdev] trusted_networks don't behave as expected

On Tue, Dec 09, 2003 at 03:43:52AM -0800, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> I don't really understand the reason for postponing the release of an existing
> fix, even if this fix adds a new "feature" that cannot harm.

Ignoring this specific issue, maintenance releases are just that ... Adding features isn't maintenance, so it's best to wait for the next non-maintenance release.

ok, this has been in 2.70 for a while. marking FIXED