|
SA Bugzilla – Full Text Bug Listing |
Summary: | Speeding up lookups on {trusted,internal,msa}_networks | ||
---|---|---|---|
Product: | Spamassassin | Reporter: | Mark Martinec <Mark.Martinec> |
Component: | Libraries | Assignee: | SpamAssassin Developer Mailing List <dev> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | kmcgrail |
Priority: | P3 | ||
Version: | SVN Trunk (Latest Devel Version) | ||
Target Milestone: | 3.4.0 | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Attachments: | implement NetSet Patricia trie lookups and caching |
Description
Mark Martinec
2010-11-04 13:27:09 UTC
Created attachment 4819 [details] implement NetSet Patricia trie lookups and caching trunk: Bug 6508: Speeding up lookups on {trusted,internal,msa}_networks Sending lib/Mail/SpamAssassin/NetSet.pm Sending lib/Mail/SpamAssassin/PerMsgStatus.pm Sending lib/Mail/SpamAssassin/Util/DependencyInfo.pm Committed revision 1031095. (In reply to comment #0) > Now, there is one caveat with using a Patricia trie for lookups: > it finds the tightest match on listed CIDR networks - unlike our > present sequential search, which returns the first match in the > order of declared networks. For all practical purposes on valid > data this makes no difference. The difference pops up on overlapping > conflicting networks. In my experience this has been an important feature. I know of a number of people that have setups like this: trusted_networks 10.20.30.0/24 internal_networks 10.20.30.0/24 !10.20.30.2/31 Where the /24 covers their MXes and the /31 covers their MSAs. I suppose that msa_networks solves this particular problem (I think) but I don't know what the uptake of msa_networks has been like... lots of people got things working long before I came up with msa_networks. I think I'd lean towards only using Net::Patricia if there are no "!" entires, or optionally. Daryl > In my experience this has been an important feature.
> I know of a number of people that have setups like this:
> trusted_networks 10.20.30.0/24
> internal_networks 10.20.30.0/24 !10.20.30.2/31
> Where the /24 covers their MXes and the /31 covers their MSAs.
The above produces a lint warning:
$ spamassassin --lint
warn: netset: cannot exclude 10.20.30.2/31 as it has already been included
Similarly, of the six failing test cases in trust_path.t all but one
produce a lint warning (if I counted correctly). The remaining one
is perhaps questionable or maybe a lint test can be added.
We have been telling people they cannot expect flawless operation
in presence of lint errors or warnings. Following a principle of
'garbage-in, garbage-out', maintaining backwards compatibility with
invalid setups is not always necessary or possible, especially
when warnings _are_ being issued.
But I agree the concept of these three sets of networks is hard
to understand and may not even be applicable to every topology,
so any hand-guiding in a form of documentation with examples
and warnings is more than welcome.
> We have been telling people they cannot expect flawless operation
> in presence of lint errors or warnings. Following a principle of
> 'garbage-in, garbage-out', maintaining backwards compatibility with
> invalid setups is not always necessary or possible, especially
> when warnings _are_ being issued.
Just to point out the obvious, and to ensure it stays that way...
We're talking Target Milestone 3.4.0.
With a target like that, backwards compatibility can be dropped, if documented and communicated properly in the release notes. If it actually introduces a noticeable advantage, even in edge-cases, *and* there are other methods to get equivalent behavior (msa_networks?).
(In reply to comment #3) > > In my experience this has been an important feature. > > I know of a number of people that have setups like this: > > trusted_networks 10.20.30.0/24 > > internal_networks 10.20.30.0/24 !10.20.30.2/31 > > Where the /24 covers their MXes and the /31 covers their MSAs. > The above produces a lint warning: > $ spamassassin --lint > warn: netset: cannot exclude 10.20.30.2/31 as it has already been included Crap, I forgot that smaller "conflicting" networks had to be declared first. I now remember arguing about why I wanted to do that too (so you didn't try to exclude something you already included or vice-versa). So, I think this may make it a non-issue, no? I think the smaller network will always be the accurate "answer". > Similarly, of the six failing test cases in trust_path.t all but one > produce a lint warning (if I counted correctly). The remaining one > is perhaps questionable or maybe a lint test can be added. It's quite possible that the tests are wrong. > We have been telling people they cannot expect flawless operation > in presence of lint errors or warnings. Following a principle of > 'garbage-in, garbage-out', maintaining backwards compatibility with > invalid setups is not always necessary or possible, especially > when warnings _are_ being issued. I just forgot about the ordering rule... if you get the warn from lint it won't actually add that part of the config (like it says). And with the smaller networks always listed first I think with Net::Patricia it'll DTRT. (In reply to comment #4) > Just to point out the obvious, and to ensure it stays that way... > We're talking Target Milestone 3.4.0. > With a target like that, backwards compatibility can be dropped I actually think, now, that it won't change any behaviour. To recap: the Net::Patricia support is done and works well, what is still needed is to polish the t/trust_path.t test which unnecessarily fails in the presence of Net::Patricia, or at least document the fact in 3.4.0 release notes. (In reply to comment #7) > To recap: the Net::Patricia support is done and works well, > what is still needed is to polish the t/trust_path.t test > which unnecessarily fails in the presence of Net::Patricia, > or at least document the fact in 3.4.0 release notes. After working on this for a few hours, I think there are some caveats with using Net::Patricia that are not well documented. However, it appears the cases that fail in trust_path.t are incompatibilities with the syntax for {trusted,internal,msa}_networks. I'm adding some code to avoid the checks if Net::Patricia is installed and some documentation warning regarding the syntax for Net::Patricia not being as flexible as NetAddr::IP with overlapping networks. I believe we can safely call it poorly configured for the admins technical enough to want to utilize Net::Patricia. If it becomes more prevalent though we might want to figure out a lint warning for all of the syntaxes that aren't supported. Committing and closing. And I can now do a make test on trunk with every single optional module installed. This also works with and without Net::Patricia though I'll test it on an older perl after the commit and report back if there is a problem. svn commit -m 'Net::Patricia not as flexible as NetAddr::IP with overlapping networks. Updated documentation to help make this more apparent and also fixed trust_path.t to avoid tests for incompatible syntaxes.' Sending lib/Mail/SpamAssassin/Conf.pm Sending lib/Mail/SpamAssassin/Util/DependencyInfo.pm Sending t/trust_path.t Transmitting file data ... Committed revision 1373194. Regards, KAM > This also works with and without Net::Patricia though I'll test it on an
> older perl after the commit and report back if there is a problem.
Also tested and passed on 5.8.6 on non-IPv6 system without Net::Patricia.
> After working on this for a few hours, I think there are some caveats with > using Net::Patricia that are not well documented. However, it appears the > cases that fail in trust_path.t are incompatibilities with the syntax for > {trusted,internal,msa}_networks. > > I'm adding some code to avoid the checks if Net::Patricia is installed and > some documentation warning regarding the syntax for Net::Patricia not being > as flexible as NetAddr::IP with overlapping networks. Documentation update: - document how Net::Patricia search differs from a sequential search - document/clarify IPv6 syntax in {trusted,internal,msa}_networks - remove the claim on flexibility trunk (3.4): Sending lib/Mail/SpamAssassin/Conf.pm Sending lib/Mail/SpamAssassin/Util/DependencyInfo.pm Sending t/trust_path.t Committed revision 1403592. |