Bug 6518 - Implement a general purpose Plugin::AskDNS, makes Spamhaus DWL possible
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Plugins
Version: SVN Trunk (Latest Devel Version)
Hardware: All All
Importance: P2 enhancement
Target Milestone: 3.4.0
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 6499
Reported: 2010-11-30 13:10 UTC by Mark Martinec
Modified: 2011-06-20 18:46 UTC (History)



Attachment Type Modified Status Actions Submitter/CLA Status
Implements the AskDNS plugin patch None Mark Martinec [HasCLA]

Description Mark Martinec 2010-11-30 13:10:04 UTC
Adding a new plugin AskDNS, which can implement Spamhaus DWL lookups.
It is a general purpose plugin implementing DNS lookups, where a
queried domain may depend on data provided by other plugins through
the existing tags mechanism.

It understands rules of the form:

  askdns RULENAME domain [rr_type [subrule]]

where:

- domain is a template for assembling a full query domain name
  (not just its zone name). The template may include any number of
  tag names (none, one, or several), using the familiar _TAGNAME_
  syntax as used in add_header rules. Specifying a tagname implies
  dependencies: the assembled query will be launched immediately
  when all the required tags become available.
  A typical example is: _DKIMDOMAIN_._vouch.dwl.spamhaus.org

- rr_type is a DNS resource record type to be queried, typically
  TXT or A, but may be any other type known to Net::DNS, such as
  AAAA, PTR, NS, SPF, SRV, ...  Missing rr_type implies A.

- optional subrule can be a regular expression, or a form already
  familiar from a URIDNSBL plugin (IP address with or without a mask,
  or a range). Absence of subrule accepts any positive response.
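
For illustration only, here is a minimal Python sketch of how a rule line of this form could be split into its fields. The names AskDnsRule and parse_askdns are hypothetical, as is the rule name in the example; the plugin itself is Perl code and its parsing may differ in detail.

```python
from typing import NamedTuple, Optional


class AskDnsRule(NamedTuple):
    """Hypothetical container for the fields of one askdns rule."""
    name: str
    template: str
    rr_type: str
    subrule: Optional[str]


def parse_askdns(line: str) -> AskDnsRule:
    # Split into at most five fields so that a subrule containing
    # spaces (e.g. a regular expression) stays in one piece.
    fields = line.split(None, 4)
    if len(fields) < 3 or fields[0] != "askdns":
        raise ValueError("expected: askdns RULENAME domain [rr_type [subrule]]")
    name, template = fields[1], fields[2]
    # A missing rr_type implies A.
    rr_type = fields[3].upper() if len(fields) > 3 else "A"
    # A missing subrule means any positive response is accepted.
    subrule = fields[4] if len(fields) > 4 else None
    return AskDnsRule(name, template, rr_type, subrule)


rule = parse_askdns(
    r"askdns DKIMDOMAIN_IN_DWL _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT /\ball\b/"
)
```

Here rule.template holds the unexpanded query template; tag substitution is a separate step.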


askdns rules are grouped by their dependencies and their queries are
launched as soon as all dependencies are met. DNS queries are grouped
by RR type and an expanded query domain, so only a single DNS query
is launched for all askdns rules which happen to evaluate to the
same rr type / domain pair.

If a tag value is a space-separated list (e.g. multiple signing
domains in a _DKIMDOMAIN_, when a message has multiple valid DKIM
signatures), then multiple queries are launched, each with one
tag value. Walking through all possible values is general and
works even with multiple tags in a template.
Comment 1 Mark Martinec 2010-11-30 13:14:08 UTC
Created attachment 4826
Implements the AskDNS plugin

trunk:

Bug 6518: Implement a general purpose Plugin::AskDNS,
makes Spamhaus DWL possible
  Sending build/parse-rules-for-masses
  Sending lib/Mail/SpamAssassin/Conf.pm
  Sending lib/Mail/SpamAssassin/Message/Metadata.pm
  Adding  lib/Mail/SpamAssassin/Plugin/AskDNS.pm
  Adding  rules/v340.pre
Committed revision 1040666.
Comment 2 Mark Martinec 2010-12-02 20:58:46 UTC
trunk:
  Bug 6518: add POD documentation to Plugin::AskDNS
Sending lib/Mail/SpamAssassin/Plugin/AskDNS.pm
Committed revision 1041671.




=head1 NAME

AskDNS - form a DNS query using tag values, and look up the DNSxL lists

=head1 SYNOPSIS

  loadplugin  Mail::SpamAssassin::Plugin::AskDNS
  askdns DKIMDOMAIN_IN_DWL _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT /\ball\b/

=head1 DESCRIPTION

Using a DNS query template as specified in a parameter of the askdns rule,
the plugin replaces tag names as found in the template with their values as
soon as they become available, and launches DNS queries. As DNS responses
trickle in, it filters them according to the requested DNS resource record
type and the optional subrule filtering expression, yielding a rule hit if a
response meets the filtering conditions.

=head1 USER SETTINGS

=over 4

=item rbl_timeout t [t_min] [zone]              (default: 15 3)

The rbl_timeout setting is common to all DNS querying rules. It can
specify a DNS query timeout globally, or individually for each zone.
See the C<Mail::SpamAssassin::Conf> POD for details on C<rbl_timeout>.

=head1 RULE DEFINITIONS

=over 4

=item askdns NAME_OF_RULE query_template [rr_type [subqueryfilter]]

A query template is a string which will be expanded to produce a domain name
to be used in a DNS query. The template may include SpamAssassin tag names,
which will be replaced with their values to form the final query domain.
The final query domain must adhere to rules governing DNS domains, i.e. must
consist of labels each up to 63 characters long, delimited by dots. There
may be a trailing dot at the end, but it is redundant / carries no semantics,
because SpamAssassin uses a Net::DNS::Resolver::send method for querying
DNS, which ignores any 'search' or 'domain' DNS resolver options.
Domain names in DNS queries are case-insensitive.
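
As a rough illustration of the label rule above, a small Python check might look like this. The function name is hypothetical, and the sketch tests only the 63-character label limit and the optional trailing dot mentioned in the text, not every constraint DNS places on names.

```python
def is_valid_query_domain(name: str) -> bool:
    """Check that a name consists of dot-delimited labels of 1..63 characters."""
    # A single trailing dot is allowed and carries no meaning here.
    if name.endswith("."):
        name = name[:-1]
    if not name:
        return False
    return all(1 <= len(label) <= 63 for label in name.split("."))
```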

A tag name is a string of capital letters, preceded and followed by an
underscore character. This syntax mirrors the add_header setting, except that
tags cannot have parameters in parentheses when used in askdns templates.
Tag names may appear anywhere in the template - each queried DNS zone
prescribes how a query should be formed.

A query template may contain any number of tag names including none,
although in the most common anticipated scenario exactly one tag name would
appear in each askdns rule. Specified tag names are considered dependencies.
Askdns rules with dependencies on the same set of tags are grouped, and all
queries in a group are launched as soon as all their dependencies are met,
i.e. when the last of the awaited tag values becomes available by a call
to set_tag() from some other plugin or elsewhere in the SpamAssassin code.
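
The dependency grouping described above could be sketched roughly as follows. This is an illustrative Python model, not the plugin's code: the PendingRules class and its methods are hypothetical, and its set_tag() merely mimics the role of the SpamAssassin call of the same name.

```python
import re

# A tag name: capital letters, preceded and followed by an underscore.
TAG_RE = re.compile(r"_([A-Z]+)_")


class PendingRules:
    """Hypothetical tracker grouping rules by the set of tags they await."""

    def __init__(self):
        self.waiting = {}   # frozenset of tag names -> list of rule names
        self.known = set()  # tags that have received a value so far

    def add_rule(self, name, template):
        deps = frozenset(TAG_RE.findall(template))
        self.waiting.setdefault(deps, []).append(name)

    def pop_ready(self):
        """Remove and return rules whose dependencies are all met."""
        ready = []
        for deps in [d for d in self.waiting if d <= self.known]:
            ready.extend(self.waiting.pop(deps))
        return ready

    def set_tag(self, tag):
        """Record that a tag became available; return rules ready to launch."""
        self.known.add(tag)
        return self.pop_ready()
```

A rule whose template contains no tags has an empty dependency set, so in this sketch pop_ready() returns it right away.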

Launched queries from all askdns rules are grouped too according to a pair
of: RR type and expanded query domain name. Even if there are multiple rules
producing the same type/domain pair, only one DNS query is launched, and
a reply to such query contributes to all the constituent rules.
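
A minimal Python sketch of this grouping, for illustration: the function name and the input format are hypothetical, and normalising case and the trailing dot before comparing is an assumption of the sketch, not something stated about the plugin.

```python
from collections import defaultdict


def group_queries(expanded):
    """Group (rule_name, rr_type, domain) triples by query.

    Returns {(rr_type, domain): [rule names]}; one DNS query would be
    launched per key, and its reply applied to every rule in the list.
    """
    groups = defaultdict(list)
    for rule, rr_type, domain in expanded:
        # Assumption: compare names case-insensitively, ignoring a trailing dot.
        key = (rr_type.upper(), domain.lower().rstrip("."))
        groups[key].append(rule)
    return dict(groups)
```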

A tag may produce none, one or multiple values. Askdns rules waiting for
a tag which never receives its value never result in a DNS query. Tags which
produce multiple values will result in multiple queries launched, each with
an expanded template using one of the tag values. An example is a DKIMDOMAIN
tag which yields a list of signing domains, one for each valid signature in
a message signed by more than one domain.

When more than one tag name appears in a template, each potentially resulting
in multiple values, a Cartesian product is formed, and each tuple results in
a launch of one DNS query (duplicates excluded). For example, a query template
_A_._B_.example.com where tag A is a list (11,22) and B is (xx,yy,zz),
will result in queries: 11.xx.example.com, 22.xx.example.com,
11.yy.example.com, 22.yy.example.com, 11.zz.example.com, 22.zz.example.com .
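
The expansion above can be modelled with a short Python sketch. The function name is hypothetical and tag values are passed in as lists; the order in which the resulting names are produced is an artifact of the sketch and carries no meaning.

```python
import itertools
import re

TAG_RE = re.compile(r"_([A-Z]+)_")


def expand_template(template, tags):
    """Expand a query template over the Cartesian product of its tag values.

    tags maps a tag name to a list of values. Duplicate results are dropped.
    """
    names = list(dict.fromkeys(TAG_RE.findall(template)))  # unique, in order
    if any(not tags.get(n) for n in names):
        return []  # a tag without a value never results in a query
    seen, result = set(), []
    for combo in itertools.product(*(tags[n] for n in names)):
        values = dict(zip(names, combo))
        domain = TAG_RE.sub(lambda m: values[m.group(1)], template)
        if domain not in seen:
            seen.add(domain)
            result.append(domain)
    return result


queries = expand_template(
    "_A_._B_.example.com", {"A": ["11", "22"], "B": ["xx", "yy", "zz"]}
)
```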

The parameter following the query template is a DNS resource record (RR)
type. A DNS result may bring resource records of multiple types, but only
those resource records matching the type specified in a rule are considered,
returned resource records with non-matching types are ignored for this rule.
Currently the RR type parameter also determines the DNS query types (not
just the filter for the result), although in future similar queries could
be combined, launching a query of type 'ANY'. Currently allowed RR types
are: A, AAAA, MX, TXT, PTR, NS, SOA, CNAME, HINFO, MINFO, WKS, SRV, SPF.

The last optional parameter of a rule is a filtering expression, a.k.a. a
subrule. Its function is much like the subrule in URIDNSBL plugin rules
(like in the uridnssub rules), or in the check_rbl eval rules. The main
difference is that with askdns rules there is no need to manually group
rules according to their queried zone, as the grouping is automatic and
duplicate queries are implicitly eliminated.

The subrule filtering parameter can be: a plain string, a regular expression,
a single numerical value, or a pair of numerical values. Absence of the
filtering parameter implies no filtering, i.e. any positive DNS response
of the requested RR type will result in a rule hit, regardless of the RR
value returned with the response.

When a plain string is used as a filter, it must match the response exactly.
Typical use is an exact text string for TXT queries.

A regular expression follows a familiar perl syntax like /.../ or m{...}
optionally followed by regexp flags (such as 'i' for case-insensitivity).
If a DNS response matches the requested RR type and the regular expression,
the rule hits. Typical use: /^127\.0\.0\.\d+$/ or m{\bdial up\b}i .

A single numerical value can be a decimal number, or a hexadecimal number
prefixed by 0x. Such numeric filtering expression is typically used with
RR type-A DNS queries. The returned value (IP address) is masked with the
specified filtering value, and the rule hits if the result is nonzero:
(r & n) != 0 .  An example: 0x10 .

A pair of numerical values (each a decimal, hexadecimal or quad-dotted)
delimited by a '-' specifies an IP address range, and a pair of values
delimited by a '/' specifies an IP address followed by a bitmask. Again,
this type of filtering expression is primarily intended with RR type-A
DNS queries. The rule hits if the returned IP address falls within the
specified range: (r >= n1 && r <= n2), or masked with a bitmask matches
the specified value: (r & m) == (n & m) .  As a shorthand notation,
a single quad-dotted value is equivalent to an n/32 form, i.e. it must
match the returned value exactly with all its bits.

Some typical examples of a numeric filtering parameter are: 127.0.1.2,
127.0.1.20-127.0.1.39, 127.0.1.0/255.255.255.0, 0.0.0.16/0.0.0.16,
0x10/0x10, 16, 0x10 .
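
To make the numeric forms concrete, here is an illustrative Python sketch of the comparisons described above. The function names are hypothetical. The value after a '/' is treated as a bitmask, as in the examples, so a prefix length such as /32 is not handled by this sketch.

```python
import ipaddress


def to_int(s):
    """Accept decimal, 0x-prefixed hexadecimal, or quad-dotted notation."""
    return int(ipaddress.IPv4Address(s)) if "." in s else int(s, 0)


def numeric_filter_hits(ip, expr):
    """Apply a numeric filtering expression to a returned IPv4 address."""
    r = to_int(ip)
    if "-" in expr:                      # range: r >= n1 && r <= n2
        n1, n2 = map(to_int, expr.split("-", 1))
        return n1 <= r <= n2
    if "/" in expr:                      # value/bitmask: (r & m) == (n & m)
        n, m = map(to_int, expr.split("/", 1))
        return (r & m) == (n & m)
    if "." in expr:                      # lone quad-dotted value: exact match
        return r == to_int(expr)
    return (r & to_int(expr)) != 0       # single number: (r & n) != 0
```

For example, numeric_filter_hits("127.0.1.25", "127.0.1.20-127.0.1.39") is true, while the same expression applied to 127.0.1.40 is false.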
Comment 3 Mark Martinec 2010-12-03 09:21:33 UTC
trunk:
  fix POD warning in AskDNS.pm; add missing v340.pre to Makefile.PL
Sending Makefile.PL
Sending lib/Mail/SpamAssassin/Plugin/AskDNS.pm
Committed revision 1041828.
Comment 4 Mark Martinec 2010-12-14 21:44:18 UTC
trunk:
  Bug 6518: Plugin::AskDNS now also accepts DNS rcode as a filtering
  subrule, making it possible to distinguish a NXDOMAIN from other failures;
  provide tags FIRSTTRUSTEDREVIP and LASTEXTERNALREVIP (first approx)
Sending lib/Mail/SpamAssassin/Message/Metadata.pm
Sending lib/Mail/SpamAssassin/Plugin/AskDNS.pm
Committed revision 1049394.

Added/updated documentation section on the filtering parameter (subrule):

[...]
The last optional parameter of a rule is a filtering expression, a.k.a. a
subrule. Its function is much like the subrule in URIDNSBL plugin rules,
or in the check_rbl eval rules. The main difference is that with askdns
rules there is no need to manually group rules according to their queried
zone, as the grouping is automatic and duplicate queries are implicitly
eliminated.

The subrule filtering parameter can be: a plain string, a regular expression,
a single numerical value or a pair of numerical values, or a list of rcodes
(DNS status codes of a response). Absence of the filtering parameter implies
no filtering, i.e. any positive DNS response (rcode=NOERROR) of the requested
RR type will result in a rule hit, regardless of the RR value returned with
the response.

When a plain string is used as a filter, it must be enclosed in single or
double quotes. For the rule to hit, the response must match the filtering
string exactly, and a RR type of a response must match the query type.
Typical use is an exact text string for TXT queries, or an exact quad-dotted
IPv4 address. In case of a TXT or SPF resource record which can return
multiple character-strings (as defined in Section 3.3 of [RFC1035]), these
strings are concatenated with no delimiters before comparing the result
to the filtering string. This follows requirements of several documents,
such as RFC 5518, RFC 4408, RFC 4871, RFC 5617.  Examples: "127.0.0.1",
"transaction", 'list' .

A regular expression follows a familiar perl syntax like /.../ or m{...}
optionally followed by regexp flags (such as 'i' for case-insensitivity).
If a DNS response matches the requested RR type and the regular expression,
the rule hits. Examples: /^127\.0\.0\.\d+$/, m{\bdial up\b}i .
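
The string and regular expression filters could be modelled along these lines. This is a hedged Python sketch, not the plugin's code: make_filter is a hypothetical name, Python's re module differs from perl regexps in some details, and applying the regular expression to the concatenated character-strings is an assumption (the text above states the concatenation only for string comparison).

```python
import re

FLAGS = {"i": re.IGNORECASE, "s": re.DOTALL, "m": re.MULTILINE, "x": re.VERBOSE}
CLOSERS = {"{": "}", "(": ")", "[": "]", "<": ">"}


def make_filter(expr):
    """Return a predicate over the list of character-strings of one RR."""
    # Quoted plain string: exact match against the concatenated strings.
    if len(expr) >= 2 and expr[0] in "'\"" and expr[-1] == expr[0]:
        expected = expr[1:-1]
        return lambda strings: "".join(strings) == expected
    # Otherwise expect /.../flags or m{...}flags (or m with another delimiter).
    body = expr[1:] if expr.startswith("m") else expr
    if len(body) < 2 or body[0].isalnum() or body[0].isspace():
        raise ValueError("not a string or regexp filter: " + expr)
    closer = CLOSERS.get(body[0], body[0])
    end = body.rfind(closer)
    if end <= 0:
        raise ValueError("unterminated regular expression: " + expr)
    flags = 0
    for ch in body[end + 1:]:
        flags |= FLAGS[ch]  # an unknown flag raises KeyError in this sketch
    rx = re.compile(body[1:end], flags)
    return lambda strings: rx.search("".join(strings)) is not None
```

For instance, make_filter('"transaction"') accepts a TXT record split into the character-strings "trans" and "action".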

[...] (text on numerical forms)

Lastly, the filtering parameter can be a comma-separated list of DNS status
codes (rcode), enclosed in square brackets. Rcodes can be represented either
by their numeric decimal values (0=NOERROR, 3=NXDOMAIN, ...), or their names.
See http://www.iana.org/assignments/dns-parameters for the list of names. When
testing for a rcode where rcode is nonzero, a RR type parameter is ignored
as a filter, as there is typically no answer section in a DNS reply when
rcode indicates an error.  Example: [NXDOMAIN], or [FormErr,ServFail,4,5] .
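
As an illustration of the rcode list syntax, a Python sketch of parsing such a filter into a set of numeric codes follows. The function name is hypothetical, only the first six IANA-assigned rcodes are listed, and matching names case-insensitively is an assumption of the sketch.

```python
# Names and values per the IANA DNS parameters registry (first six only).
RCODES = {
    "NOERROR": 0, "FORMERR": 1, "SERVFAIL": 2,
    "NXDOMAIN": 3, "NOTIMP": 4, "REFUSED": 5,
}


def parse_rcode_filter(expr):
    """Turn '[NXDOMAIN]' or '[FormErr,ServFail,4,5]' into a set of integers."""
    if not (expr.startswith("[") and expr.endswith("]")):
        raise ValueError("rcode list must be enclosed in square brackets")
    codes = set()
    for item in expr[1:-1].split(","):
        item = item.strip()
        # Numeric decimal value, or a name looked up case-insensitively.
        codes.add(int(item) if item.isdigit() else RCODES[item.upper()])
    return codes
```

A reply would then match the rule if its rcode is a member of the returned set.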
Comment 5 Mark Martinec 2011-06-20 18:46:16 UTC
One enhancement to the AskDNS plugin, and two minor tweaks:

trunk:
- Bug 6518: AskDNS: allow a list of rr_types (including ANY) in askdns directive;
- set_tag tweaks in DNS and ASN plugins: make use of a listref as a tag value;
- let URIDNSBL make available its list of URI hosts and domains as;
  Sending lib/Mail/SpamAssassin/PerMsgStatus.pm
  Sending lib/Mail/SpamAssassin/Plugin/ASN.pm
  Sending lib/Mail/SpamAssassin/Plugin/AskDNS.pm
  Sending lib/Mail/SpamAssassin/Plugin/DKIM.pm
  Sending lib/Mail/SpamAssassin/Plugin/URIDNSBL.pm
Committed revision 1137734.


This makes it possible to write rules that query DNS for the existence
of domain names or host names found in URIs. For example:

  askdns L_URI_NXHOST      _URIHOSTS_    A,AAAA [NXDOMAIN]
  askdns L_URI_NXDOMAIN    _URIDOMAINS_  ANY [NXDOMAIN]
  askdns L_URI_NXDOMAIN_NS _URIDOMAINS_  NS [NXDOMAIN]

From a bit of watching our logs, it seems that
M::S::PerMsgStatus::get_uri_detail_list() is overly aggressive in
finding URIs in mail: Python filenames end up as Paraguayan domains,
and Perl filenames as Polish domains. Also, spammers and half-literate
users often leave out the space after a full stop, which joins the last
word of one sentence to the first word of the next with a dot, and the
result is often picked up as a URI. Anyway, this
is unrelated (perhaps a subject for some other bug report) - I just
wanted to illustrate the first impression with the above experimental
rules.

The AskDNS plugin is mostly complete now - there may be a tweak or
two still to come, but nothing major. As this PR mainly serves
documentation purposes, I'll just close it now.