SA Bugzilla – Bug 6780
Existing but empty From: and To:
Last modified: 2019-05-05 13:57:51 UTC
In a recent wave of spam I have seen From: and To: headers that exist but are empty. MISSING_HEADERS does not match, since these headers exist. The closest rule that matches such spam is FH_FROMEML_NOTLD, but it only covers the From: header. I therefore propose something like this:

    header   EMPTY_FROM  From =~ /^\s*$/
    describe EMPTY_FROM  empty From:
    header   EMPTY_TO    To =~ /^\s*$/
    describe EMPTY_TO    empty To:

or

    meta EMPTY_TO & EMPTY_FROM
If these rules only operate when the header is present, then I suggest the same for ALL other headers except "Bcc:", since if present they may not be empty.

Only Bcc is permitted to be empty when present and, except during mail submission agent processing, is expected and supposed to be empty if present.
(In reply to comment #1)
> If these rules only operate when the header is present, then I suggest the same
> with ALL other headers except "Bcc:" - since if present, they may not be empty.
>
> Only BCC is permitted to be empty when present, and except for mail submission
> agent processing, is expected and supposed to be empty if present.

I agree. I think it needs a meta for !MISSING_HEADERS.

    header   __EMPTY_FROM       From =~ /^\s*$/
    header   __EMPTY_TO         To =~ /^\s*$/
    header   __EMPTY_CC         Cc =~ /^\s*$/
    meta     EMPTY_TO_AND_FROM  (!MISSING_HEADERS && (__EMPTY_FROM + __EMPTY_TO + __EMPTY_CC >= 2))
    describe EMPTY_TO_AND_FROM  Mail contains headers that are blank and shouldn't be.
    score    EMPTY_TO_AND_FROM  1.0

Lemat, the above passes lint. Does it hit on the emails you are seeing?

regards,
KAM
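As a reading aid, the meta expression above can be modelled in a few lines of Python. This is only a sketch: the function name and its boolean parameters are made up for illustration, and it assumes each sub-rule contributes 0 or 1 to the sum. It shows that, despite its name, the rule fires on any two blank headers out of From, To and Cc.

```python
def empty_to_and_from(missing_headers, empty_from, empty_to, empty_cc):
    """Hypothetical model of the EMPTY_TO_AND_FROM meta expression."""
    # Booleans add up as 0/1, mirroring
    # __EMPTY_FROM + __EMPTY_TO + __EMPTY_CC in the meta rule
    # (assumption: each sub-rule counts at most once).
    blank_count = sum([empty_from, empty_to, empty_cc])
    # Equivalent of !MISSING_HEADERS && (... >= 2)
    return (not missing_headers) and blank_count >= 2


if __name__ == "__main__":
    # From and Cc blank, To filled in: still a hit.
    print(empty_to_and_from(False, True, False, True))
    # Only one blank header: no hit.
    print(empty_to_and_from(False, True, False, False))
```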
Kevin, your rules do match the spam run I see. Meanwhile I was also testing something different:

    header   __HAS_FROM    exists:From
    header   __EMPTY_FROM  From =~ /^\s*$/
    meta     EMPTY_FROM    __HAS_FROM && __EMPTY_FROM
    describe EMPTY_FROM    empty from
    score    EMPTY_FROM    1

    header   __HAS_TO      exists:To
    header   __EMPTY_TO    To =~ /^\s*$/
    meta     EMPTY_TO      __HAS_TO && __EMPTY_TO
    describe EMPTY_TO      empty to
    score    EMPTY_TO      1

and it also did the job. But (I believe) your rule is faster.
Hmm... MISSING_HEADERS operates only on the To: header:

    header MISSING_HEADERS eval:check_for_missing_to_header()

    sub check_for_missing_to_header {
      my ($self, $pms) = @_;
      my $hdr = $pms->get('To');
      $hdr = $pms->get('Apparently-To') if $hdr eq '';
      return 1 if $hdr eq '';
      return 0;
    }

which is not exactly what I have been thinking about. And I have been thinking not about AND but about OR, something like this:

    header   __EMPTY_FROM  From =~ /^\s*$/
    header   __EMPTY_TO    To =~ /^\s*$/
    header   __EMPTY_CC    Cc =~ /^\s*$/
    header   __HAS_FROM    exists:From
    header   __HAS_TO      exists:To
    header   __HAS_CC      exists:CC
    meta     EMPTY_TO_OR_FROM_OR_CC  (__HAS_TO && __EMPTY_TO) || (__HAS_FROM && __EMPTY_FROM) || (__HAS_CC && __EMPTY_CC)
    describe EMPTY_TO_OR_FROM_OR_CC  Mail contains headers that are blank and shouldn't be.
    score    EMPTY_TO_OR_FROM_OR_CC  1.0
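The point of pairing each blank test with an exists: test can be sketched in Python. The helper names and the dict representation of a message are hypothetical, and the model assumes, purely for illustration, that a missing header would be tested as an empty string; whether SpamAssassin actually does that is not settled in this thread. Under that assumption the blank test alone cannot tell "missing" from "present but empty", and the guard restores the distinction.

```python
import re

BLANK = re.compile(r"^\s*$")


def has(headers, name):
    # Model of `exists:Name`: true when the header is present at all.
    return name in headers


def blank(headers, name):
    # Model of the blank-value regex test.
    # ASSUMPTION (illustrative only): a missing header is tested as "".
    return BLANK.search(headers.get(name, "")) is not None


def empty_to_or_from_or_cc(headers):
    # Model of the OR-style meta: any header that is present AND blank.
    return any(has(headers, n) and blank(headers, n)
               for n in ("To", "From", "Cc"))


if __name__ == "__main__":
    # From is present but empty: hit.
    print(empty_to_or_from_or_cc({"From": "", "To": "bob@example.com"}))
    # To and Cc are absent, From is filled in: no hit.
    print(empty_to_or_from_or_cc({"From": "alice@example.com"}))
```

Note that this OR form fires on a single blank header, whereas the earlier count-based meta needs at least two.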
> hmm... MISSING_HEADERS is operating only on To: header

Right, your rule set seems more to the point. If multiple (although illegal) From/To/Cc header fields are to be taken into account, the regexp /m flag should be used:

    header   __HAS_FROM    exists:From
    header   __HAS_TO      exists:To
    header   __HAS_CC      exists:CC
    header   __EMPTY_FROM  From =~ /^\s*$/m
    header   __EMPTY_TO    To =~ /^\s*$/m
    header   __EMPTY_CC    Cc =~ /^\s*$/m
    meta     EMPTY_FROM_OR_TO_OR_CC  (__EMPTY_FROM && __HAS_FROM) || (__EMPTY_TO && __HAS_TO) || (__EMPTY_CC && __HAS_CC)
    describe EMPTY_FROM_OR_TO_OR_CC  Contains a header field that is blank and shouldn't be.
    score    EMPTY_FROM_OR_TO_OR_CC  1.0

(If we don't care about multiple instances, a rule like

    header __EMPTY_FROM From !~ /\S/

might be faster.)

Btw, we already have a __HAS_FROM rule (along with __HAS_RCVD, __HAS_MESSAGE_ID, __HAS_DATE and __HAS_SUBJECT). It can't hurt to add __HAS_TO and __HAS_CC for completeness, even if it turns out they won't be used.
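The effect of the /m flag can be checked with a small Python sketch. Python's re module stands in for Perl's regex engine here (the two agree on these patterns), the function names are made up, and the sketch assumes that multiple instances of a header reach the rule as one string with the values joined by newlines.

```python
import re


def blank_anchored(value, multiline=False):
    # Model of the anchored blank test, optionally with the /m flag.
    flags = re.MULTILINE if multiline else 0
    return re.search(r"^\s*$", value, flags) is not None


def blank_no_nonspace(value):
    # Model of `!~ /\S/`: true when there is no non-whitespace at all.
    return re.search(r"\S", value) is None


# ASSUMPTION: two From: fields, the second one empty, joined by newlines.
two_from = "alice@example.com\n\n"

if __name__ == "__main__":
    print(blank_anchored(two_from))                  # without /m
    print(blank_anchored(two_from, multiline=True))  # with /m
    print(blank_no_nonspace(two_from))               # the !~ /\S/ variant
```

Under that assumption only the /m variant notices the empty second instance. The !~ /\S/ form treats the combined value as non-blank, which matches the caveat that it only suits the case where multiple instances don't matter.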
Moving all open bugs where target is defined and 3.4.0 or lower to 3.4.1 target
Be careful with this: MS Exchange (and, separately, MS Outlook) will note the missing To header by adding one that looks like this:

    To: Undisclosed recipients:;

Other inbox servers, email clients, or combinations of those may have other defaults. This means that mass-check runs on corpora partially constructed from infrastructure that mucks with this will give erroneous results.
Rules are not bound to a specific code release. Changing to undefined release.
I have added test rules for this to my sandbox for RuleQA testing
Rule added to sandbox with commit r1839799 on Bill's sandbox.