SA Bugzilla – Bug 6657
can't have both header existence test and header field modifier, e.g. exists:References:addr
Last modified: 2012-01-17 18:25:25 UTC
The following example rule is invalid: header __HAS_REFERENCES_ADDR exists:References:addr It generates: lint: config: invalid regexp for rule __T_AK_HAS_REFERENCES_ADDR: : missing or invalid delimiters I expected it to be identical to: header __HAS_REFERENCES_ADDR References:addr =~ /^\S/ If this functionality is undesired, the documentation should forbid it. (This is not a regression afaict.)
Reading the docs again, I'm not even positive it hints this would be supported. The exists: documentation states to test the header for existence -- the header itself. Clearly, even after stripping anything but the first email address as :addr does, this would not have an impact on the header's existence...
Actually, the docs only specifically mention exists:Header and Header:addr as special header tests, with no word suggesting Header might be anything more complicated than a plain mail header. The :addr one might be not that obvious, but exists:name_of_header specifically mentions a header name only. IMHO, this does not need to be supported, since exists: merely is "a very simple version of the above header tests" (quoting the docs) and can easily be done with an RE rule. Moreover, I believe the docs do not specifically need to forbid this, since it doesn't suggest this to be supported. Not a bug IMHO, closing RESOLVED INVALID accordingly. Please feel free to re-open the bug report, if you feel strongly about it.
(In reply to comment #2) > Actually, the docs only specifically mention exists:Header and Header:addr as > special header tests, with no word suggesting Header might be anything more > complicated than a plain mail header. > > The :addr one might be not that obvious, but exists:name_of_header specifically > mentions a header name only. > > IMHO, this does not need to be supported, since exists: merely is "a very > simple version of the above header tests" (quoting the docs) and can easily be > done with an RE rule. Moreover, I believe the docs do not specifically need to > forbid this, since it doesn't suggest this to be supported. > > Not a bug IMHO, closing RESOLVED INVALID accordingly. Please feel free to > re-open the bug report, if you feel strongly about it. I worry that "invalid regexp" and "missing or invalid delimiters" might be too vague for troubleshooting this issue. Also, I think the docs are indeed ambiguous. First, it is unclear if a blank header will match an exists: test. Second, :addr is described as modifying the header for the current rule: > Appending a modifier ":addr" to a header field name will cause everything > except the first email address to be removed from the header field. Assuming exists: does not match an empty (but still present) header, this implies that a header that does not contain an address is empty and therefore does not match exists: even if it has other data.
> I worry that "invalid regexp" and "missing or invalid delimiters" might be too > vague for troubleshooting this issue. > > Also, I think the docs are indeed ambiguous. Better? trunk: Bug 6657: tighten header rule parsing syntax check, improve error message for a header rule like "exists:References:addr" Sending lib/Mail/SpamAssassin/Conf/Parser.pm Sending lib/Mail/SpamAssassin/Conf.pm Committed revision 1166471. > it is unclear if a blank header will match an exists: test. Fixed documentation by the above commit. Now it says: header SYMBOLIC_TEST_NAME exists:header_field_name Define a header field existence test. "header_field_name" is the name of a header field to test for existence. Not to be confused with a test for an empty header field body, which can be implemented with a "header SYMBOLIC_TEST_NAME header_field_name op /pattern/modifiers" rule described above. > Assuming 'exists:' does not match an empty (but still present) header True, since 3.3.0. See Bug 5965.
Committed revision 1166476. Slight change of wording: header SYMBOLIC_TEST_NAME exists:header_field_name Define a header field existence test. "header_field_name" is the name of a header field to test for existence. Not to be confused with a test for a nonempty header field body, which can be implemented by a "header SYMBOLIC_TEST_NAME header =~ /\S/" rule as described above.
I like the documentation changes, but I think "missing or invalid delimiters" was better than the new phrasing, lint: config: SpamAssassin failed to parse line, "__HAS_REFERENCES_ADDR exists:References:addr" does not specify a valid header field name for "header", skipping: header __HAS_REFERENCES_ADDR exists:References:addr at build/mkrules line 254. $value and $line are nearly identical and therefore redundant. Maybe something like: $parse_error = "config: SpamAssassin failed to parse line, \"$def\" ". "does not specify a valid field name for \"$key\", ". "skipping: $line"; Another note: even though this fails t/lint.t (but only on the first run since a 'make clean' or rules update), 'make test' does not fail and 'spamassassin --lint' still has no output (and -D doesn't add anything on this front). Are these the desired outcomes? Separately, and I know this was in the code before Mark touched it, I dislike our use of the regex character class [!-9;-\176] which matches the two ranges from "!" to "9" and from ";" to "\176" (~, tilde). I had to look at a character map to even figure out that this is really just looking for any non-colon/non-space/non-extended printable characters (basically [^:\s] though I think other code has already handled the white space). (An extra level of confusion: POSIX regexps that begin with an exclamation point are negated.) Do we really need to look for extended or non-printing characters here? Can we instead simplify this to [^:] or [^:\s] ?
> lint: config: SpamAssassin failed to parse line, "__HAS_REFERENCES_ADDR > exists:References:addr" does not specify a valid header field name for > "header", skipping: header __HAS_REFERENCES_ADDR exists:References:addr > at build/mkrules line 254. > $value and $line are nearly identical and therefore redundant. Agreed, redundant. > Maybe something like: > $parse_error = "config: SpamAssassin failed to parse line, \"$def\" ". > "does not specify a valid field name for \"$key\", ". > "skipping: $line"; The $def is unfortunately no longer available in Parser.pm where the diagnostics is issued. So to avoid having to re-parse I just simplified the text, it now reads: $ spamassassin -L --lint Sep 14 11:17:36.188 [94102] warn: config: SpamAssassin failed to parse line, it does not specify a valid header field name, skipping: header SYMBOLIC_TEST_NAME exists:From:addr when encountering an illegal directive such as this. > Another note: even though this fails t/lint.t (but only on the first run since > a 'make clean' or rules update), 'make test' does not fail and 'spamassassin > --lint' still has no output (and -D doesn't add anything on this front). Are > these the desired outcomes? I don't understand what you are saying. The 'spamassassin --lint' does produce output when there is an error in one of the configuration files. Where is a t/lint.t ? > Separately, and I know this was in the code before Mark touched it, I dislike > our use of the regex character class [!-9;-\176] which matches the two > ranges from "!" to "9" and from ";" to "\176" (~, tilde). I had to look at a > character map to even figure out that this is really just looking for any > non-colon/non-space/non-extended printable characters The regexp was just transliterated from RFC 5322 section 3.6.8: field-name = 1*ftext ftext = %d33-57 / ; Printable US-ASCII %d59-126 ; characters not including ; ":". But I agree it is probably an overkill. Changed it to: } elsif ($header_name !~ /^([^: \t]+)$/) { # be generous trunk: Sending lib/Mail/SpamAssassin/Conf/Parser.pm Sending lib/Mail/SpamAssassin/Conf.pm Committed revision 1170488.
Can this be closed as fixed now?
Without additional comments, considering fixed with patches submitted.