Bug 6645 - Problem with detection of authenticated mails coming from servers running qmail-scanner
Summary: Problem with detection of authenticated mails coming from servers running qma...
Status: RESOLVED FIXED
Alias: None
Product: Spamassassin
Classification: Unclassified
Component: Rules (show other bugs)
Version: 3.3.2
Hardware: All All
: P2 major
Target Milestone: Undefined
Assignee: SpamAssassin Developer Mailing List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-05 22:13 UTC by spamassassinbugs
Modified: 2019-08-01 06:44 UTC (History)
5 users (show)



Attachment Type Modified Status Actions Submitter/CLA Status
Draft at which place the correct code has to be inserted patch None spamassassinbugs@htl-leonding.ac.at [NoCLA]
Header for testing purposes text/plain None spamassassinbugs@htl-leonding.ac.at [NoCLA]
Patch against released version 3.3.2 (cleaning up) patch None spamassassinbugs@htl-leonding.ac.at [NoCLA]
Patch against released version 3.3.2 (cleaning up, now the right version) patch None spamassassinbugs@htl-leonding.ac.at [NoCLA]

Note You need to log in before you can comment on or make changes to this bug.
Description spamassassinbugs 2011-08-05 22:13:41 UTC
A user using an IP-address being i.e. blacklisted on the zen.spamhaus.org-RBL (in this example 188.45.128.1) sends a mail using an authenticated connection (notice the expression "ESMTPA"). This server (firstmailserver.domain.local) is running qmail-scanner which will forward the mail because the user was authenticated and so spamassassin is bypassed on this server. Unfortunately qmail-scanner inserts an own mail-header (line Received: from 188.45.128.1). 

When the mail arrives at the target (which will see the appended mail) and the target is running spamassassin (tested using 3.3.1 and 3.3.2), it doesn't recognize that user1 was authenticated because of the line inserted by qmail-scanner (deleting it for testing purposes will show that) and so it checks the user's dial-up IP against the RBL and marks the mail as evil spam - especially if the IP is on several RBLs which is not unusual nowadays. In this case only one RCVD_IN_PBL is triggered, but I have seen other dial-up IPs hitting several RBL-related scores causing spamassassin to create high spam scores causing the mail to be rejected.

I'm unsure whether this is the fault of qmail-scanner (tested versions 2.05 and 2.08) (creating the line) or of spamassassin (not recognizing the line).

http://wiki.apache.org/spamassassin/DynablockIssues states that the devs should be informed when spamassassin doesn't support the authentication method (without telling how), so I submitted this report.

Return-Path: <user1@domain.local>
Received: from unknown (HELO firstmailserver.domain.local) (192.168.0.10)
  by 0 with ESMTPS (DHE-RSA-AES256-SHA encrypted); 29 Apr 2011 18:34:01 -0000
Received: (qmail 19866 invoked by uid 210); 29 Apr 2011 18:23:54 -0000
Received: from 188.45.128.1 (user1@domain.local@188.45.128.1) by firstmailserver (envelope-from <user1@domain.local>, uid 201) with qmail-scanner-2.05st
 (clamdscan: 0.97/13021. spamassassin: 3.3.1. perlscan: 2.05st.
 Clear:RC:1(188.45.128.1):.
 Processed in 0.013931 secs); 29 Apr 2011 18:23:54 -0000
Received: from unknown (HELO ?127.0.0.1?) (user1@domain.local@188.45.128.1)
  by 0 with ESMTPA; 29 Apr 2011 18:23:53 -0000
Message-ID: <4DBB055E.8040102@domain.local>
Date: Fri, 29 Apr 2011 20:37:18 +0200
From: User1 <user1@domain.local>
To: user2@otherdomain.local
Subject: Testmail

Test
Comment 1 AXB 2011-08-05 22:29:06 UTC
FTR: 188.45.128.1 is not blacklisted - PBL is a policy list which contains IPs which should not send mail to the net

http://www.spamhaus.org/pbl/query/PBL337163

inetnum: 188.45.128.0 - 188.45.255.255
netname: MOBILKOM-MOBILEPOOL9
descr: MobilePools
country: AT

Your sample doesn't show the Received header added by the relay used by sender.
(or am I missing something?)
Comment 2 spamassassinbugs 2011-08-06 10:28:54 UTC
First thanks for the quick reply.

The term "blacklisted" might not technically correct, but the entry on the zenhaus-PBL combined with spamassassin not being aware of the authentication causes a penalty score, which can cause a legitimate mail - especially if the originator-IP is on several PBLs - to be effectively seen as spam going to be rejected, which would not happen when spamassassin would correctly check the mail. 

So in my opinion spamassassin's rules combined with the wrong detection of the authentication cause the mail to be effectively blacklisted. As the PBL-checks depend on the activation of the RBL-checks I mixed up those two terms. But if you want to use another technically correct term for it, you are free to do so, but please focus on the real problem that spamassassin doesn't recognize the message of the ESMTPA line because of the second received header inserted by qmail-scanner (the from 188.45.128.1 ... line). 

Following header was added by the target mailserver which received the mail from the "relay" firstmailserver.domain.local:
Received: from unknown (HELO firstmailserver.domain.local) (192.168.0.10)
  by 0 with ESMTPS (DHE-RSA-AES256-SHA encrypted); 29 Apr 2011 18:34:01 -0000


Received: from 188.45.128.1 (user1@domain.local@188.45.128.1) by
firstmailserver (envelope-from <user1@domain.local>, uid 201) with
qmail-scanner-2.05st
 (clamdscan: 0.97/13021. spamassassin: 3.3.1. perlscan: 2.05st.
 Clear:RC:1(188.45.128.1):.
 Processed in 0.013931 secs); 29 Apr 2011 18:23:54 -0000
Received: from unknown (HELO ?127.0.0.1?) (user1@domain.local@188.45.128.1)
  by 0 with ESMTPA; 29 Apr 2011 18:23:53 -0000

was the header created by the relay (you can see that it states that it received the mail from 188.45.128.1 and that it itself is firstmailserver and that this line was created "with qmail-scanner").

You can also see the first received header stating the  mail was received from the mail client on 188.45.128.1 identified as user1@domain.local authenticated with ESMTPA.
Comment 3 spamassassinbugs 2011-08-06 13:09:15 UTC
Correction of comment 2 - the relay inserts three received headers:

Received: (qmail 19866 invoked by uid 210); 29 Apr 2011 18:23:54 -0000
Received: from 188.45.128.1 (user1@domain.local@188.45.128.1) by
firstmailserver (envelope-from <user1@domain.local>, uid 201) with
qmail-scanner-2.05st
 (clamdscan: 0.97/13021. spamassassin: 3.3.1. perlscan: 2.05st.
 Clear:RC:1(188.45.128.1):.
 Processed in 0.013931 secs); 29 Apr 2011 18:23:54 -0000
Received: from unknown (HELO ?127.0.0.1?) (user1@domain.local@188.45.128.1)
  by 0 with ESMTPA; 29 Apr 2011 18:23:53 -0000

Forgot to assign the Received: (qmail ... line to the relay

A qmail-server which not running qmail-scanner would only insert
Received: (qmail 19866 invoked from network); 29 Apr 2011 18:23:54 -0000
Received: from unknown (HELO ?127.0.0.1?) (user1@domain.local@188.45.128.1)
  by 0 with ESMTPA; 29 Apr 2011 18:23:53 -0000
which doesn't confuse spamassassin on the target.

Notice the difference between "invoked from network" when qmail-scanner is not used and "invoked by uid xxx" when qmail-scanner is used.

The problem is that the received-header created on the relay by qmail-scanner:

Received: from 188.45.128.1 (user1@domain.local@188.45.128.1) by
firstmailserver (envelope-from <user1@domain.local>, uid 201) with
qmail-scanner-2.05st
 (clamdscan: 0.97/13021. spamassassin: 3.3.1. perlscan: 2.05st.
 Clear:RC:1(188.45.128.1):.
 Processed in 0.013931 secs); 29 Apr 2011 18:23:54 -0000

seems to break spamassassins trust-path as it doesn't recognize the connection to the received-header in the line before which states that ESMTPA was used.

This problem occurs when a dial-up user sends a mail using i.e. his company's mailserver running qmail-scanner as relay to a user on another domain which uses spamassassin to filter spam where the mail will not be accepted because of the broken trust path.
Comment 4 Giampaolo Tomassoni 2011-08-06 13:56:43 UTC
You seem right, the problem is about a broken trust path.

However, it is introduced by qmail-scanner, when puts a 'Received:' like the one you pointed out, and messes things not reporting a with a (whatever)A token.

qmail-scanner shouldn't report the 188.45.128.1 address at all, after all: it got the message from localhost, not from there.

I would suggest to look at qmail-scanner settings in order to either suppress to emission of that 'Received:', or at least let it report 127.0.0.1 as the source, which would lead SA to discard it.

If there is no such setting, I would report the problem to the qmail-scanner stream: SA only spotted the problem. It isn't the source of it, after all.
Comment 5 spamassassinbugs 2011-08-06 17:37:30 UTC
I think as qmail-scanner is inserted before qmail-queue it sees itself as part of the qmail-system, which received the mail from an external client as it is not locally generated, but it should add the right header (but I haven't read into the source to find out whether it can detect the authentication or just the IP from which the server got the mail).

Regarding your idea to fix the problem in the qmail-scanner sources, I have a major point for you to consider:

* The problem becomes critical at the target using spamassassin because there spamassassin penalty scores the mail (possibly causing it to be dropped), but the admin of the target mail server has no direct possibility to fix the problem on the relay, because it is not necessarily under his control. He can only fix the problem caused when he is running qmail-scanner and mails from his users are dropped somewhere else, he can't fix it on the relay but is blamed for his server rejecting legitimate mails (and do you want to check lots of possible communication partners servers whether they are running a non-fixed qmail-scanner version and argue with their admins (having RBL-checking disabled, claiming there is no problem)?)

In my opinion both ends should be fixed (spamassassin should have a workaround for detecting the trust path of "broken" qmail-scanner servers) and future qmail-scanner versions should either not insert this header or do it in the right way.
Comment 6 Giampaolo Tomassoni 2011-08-06 22:01:43 UTC
(In reply to comment #5)
> I think as qmail-scanner is inserted before qmail-queue it sees itself as part
> of the qmail-system, which received the mail from an external client as it is
> not locally generated,

 This seems a wrong assumption by qmail-scanner: the 'Received:' header from the MX running the scanner is right below the qmail-scanner one.


> but it should add the right header (but I haven't read
> into the source to find out whether it can detect the authentication or just
> the IP from which the server got the mail).

 I believe it shouldn't detect anything at all: authentication or source identification are not its purposes.


> Regarding your idea to fix the problem in the qmail-scanner sources, I have a
> major point for you to consider:
> 
> * The problem becomes critical at the target using spamassassin because there
> spamassassin penalty scores the mail (possibly causing it to be dropped), but
> the admin of the target mail server has no direct possibility to fix the
> problem on the relay, because it is not necessarily under his control. He can
> only fix the problem caused when he is running qmail-scanner and mails from his
> users are dropped somewhere else, he can't fix it on the relay but is blamed
> for his server rejecting legitimate mails (and do you want to check lots of
> possible communication partners servers whether they are running a non-fixed
> qmail-scanner version and argue with their admins (having RBL-checking
> disabled, claiming there is no problem)?)

 well, I believe that a customer running its own mail server should take its own responsabilities with respect to what is going out of its MX.  Also, say you get SA 'fixed' with respect to this issue. What about  the other spam-checking software available? How they behave with those messages? Are you going to fix all the receivers in the world? I would rather try to fix the sources...


> In my opinion both ends should be fixed (spamassassin should have a workaround
> for detecting the trust path of "broken" qmail-scanner servers) and future
> qmail-scanner versions should either not insert this header or do it in the
> right way.

SA could disregard receiveds with 'with qmail-scanned' in it. That would fix the issue as longo as those headers don't convey any useful information. This is a dangerouse hack, however, because there may be cases in which that line does convey useful data: we can't exclude this a priori. So your suggestion doesn't look too strong to me.

But this is me, after all. The fastest way to get something fixed in SA is to write a patch for it
Comment 7 D. Stussy 2011-08-07 00:50:00 UTC
Is this really a SpamAssassin problem, or is it tied to qmails' RFC noncompliant "Received:" headers?

1)  Although RFC 5322 does syntactically permit an empty subfield (after CNFS stripping) before the semicolon and time stamp, such serves no semantic purpose under the RFC.

2)  IP address literals in the "from" and "by" clauses (not counting the CNFS comments) must be in brackets because the ABNF specification calls for a domain name literal.  Unbracketed IP addresses [outside comments] are a syntax error.

3)  "[W]ith qmail-scanner-2.05st" is an unauthorized transport method per the IANA registry for messages.

4)  I'm not certain that parentheses within comments (i.e. comment nesting) is permitted in the ABNF syntax of RFC 5322 (e.g. your third "Received:" header).

For these reasons, I would oppose any "workaround" as qmail is clearly broken software.  RFC noncompliant software (especially when it comes to RFCs 5321 and 5322, where their predecessors, 2821 and 2822 are part of STD 10, and the newer versions don't really change much) should be scored in a negative manner (i.e. toward spam - with a positive score).  Therefore, the failure to parse properly which in this case yields a higher score is a proper result.


I'm not singling out qmail:  Microsoft Exchange has its own problems with RFC noncompliant "Received:" headers (e.g. 'with Microsoft SMTPSVC' which is incorrect; 'with "Microsoft SMTPSVC"' is correct - i.e. "with" takes an ABNF "atom" or an ABNF "quoted-string" only; unquoted multiple words are erroneous).  SA should score toward spam EVERY RFC-noncompliant element found.
Comment 8 spamassassinbugs 2011-08-07 14:32:43 UTC
Regarding comment 6:
When a mail is sent authenticated via a relay, spamassassin should never check the dial-up IP against a PBL. The failure to detect this is a bug (at least in my opinion), because it causes legitimate mail to be detected as spam.

It might be qmail-scanner's fault to insert an incorrect header and that should be fixed too, but it is spamassassin not detecting that the mail was authenticated and so it does things it should not do.

Regarding comment 7 sounding like "we won't support qmail/non-RFC-compliant software":
Scan through your tarball and you will find that there are some pieces of code dealing with qmail or qmail-scanner included. The tarball even contains a qmail-spamc.c file as replacement of qmail-scanner (but without capability for virus checking).

Spamassassin has no problem detecting qmail's "with ESMTPA"-header. It only chokes about qmail-scanner's additional received header. It seems that there is no code combining the line "with ESMTPA" line with the according Received: ... with qmail-scanner header for correctly checking trust path.

Regarding your idea to penalty score all non-RFC software you are free to do so, but do this only on your own mail system, because otherwise you will make many of your users unhappy (because of the sudden misdetection of many legitimate mail (from companies using i.e. Exchange) as spam) causing spamassassin to lose its reputation as software of outstanding quality (no user without technical knowledge would understand your decision nor would the admins having to deal with lost/undelivered mails, having to change spamassassin back to "normal"). 

Regarding all comments:
It seems to me that until now nobody knowing how to fix this bug by patching [SPAMASSASSIN-TAR]/lib/Mail/SpamAssassin/Message/Metadata/Received.pm reads this bug report. In my opinion the fix would be to make spamassassin combine the qmail-scanner line internally with the previously inserted authentication line into one to correctly detect the authentication.

If the unwillingness to fix it is because I assigned the bug to the rules-component (where maybe nobody is responsible for the Received.pm), then please tell me to which component it should be assigned or reassign it.

Thanks for your interest in taking bug reports serious and seeing them as a way to improve the detection engine.
Comment 9 Giampaolo Tomassoni 2011-08-07 16:35:24 UTC
I agree with Stussy, except for the fact I believe SA is an e-mail categorizing software, not an RFC compliance tester.

I'm not a PMC, anyway my belief is that SA could apply some work-around technique to the problem *afflicting qmail-scanner*, iff numbers are large enough to justify the risk of possibly introducing a weakness in the chain-of-trust detection alghoritm *and* the qmail-scanner people is unwilling to fix their bits. Otherwise, there is either no evident reason to do it and/or not enough numbers to justify it.

Please also note that the SA userbase is large, consisting of lots of different SA versions running. Updating SA may not effectively fix the problem for a very long time, while it is probably much more efficient to fix it at the source.

If I was a PMC, I would probably had closed this bug as INVALID...
Comment 10 spamassassinbugs 2011-08-07 16:46:11 UTC
[SPAMASSASSIN-TAR]/lib/Mail/SpamAssassin/Message/Metadata/Received.pm
contains code which should handle the additional qmail-scanner header, but
that is not reached and doesn't trigger.

I used spamassassin 3.3.1 for experimenting. There the code for catching the
case # Received from x1-6-00-04-bd-d2-e0-a3-k317.webspeed.dk ... seems to catch
the qmail-scanner line jumping to enough in line 1229, bypassing the
qmail-scanner handling code in 1092 and the code in line 1136 (later would
probably the right qmail-scanner code?!).

So spamassassin contains code which should handle qmail-scanner, but that is
never called ... (I think that is the definition of a BUG). 

BTW: I was using perl 5.12 for testing purposes so this might be causing
additional trouble, but as spamassassin 3.3.2 doesn't work correctly either
this is not the root cause.
Comment 11 spamassassinbugs 2011-08-07 16:52:45 UTC
(In reply to comment #9)
> I agree with Stussy, except for the fact I believe SA is an e-mail categorizing
> software, not an RFC compliance tester.
> 
> I'm not a PMC, anyway my belief is that SA could apply some work-around
> technique to the problem *afflicting qmail-scanner*, iff numbers are large
> enough to justify the risk of possibly introducing a weakness in the
> chain-of-trust detection alghoritm *and* the qmail-scanner people is unwilling
> to fix their bits. Otherwise, there is either no evident reason to do it and/or
> not enough numbers to justify it.
> 
> Please also note that the SA userbase is large, consisting of lots of different
> SA versions running. Updating SA may not effectively fix the problem for a very
> long time, while it is probably much more efficient to fix it at the source.
> 
> If I was a PMC, I would probably had closed this bug as INVALID...

I'm trying to make the problem clear to the qmail-scanner developer since your suggestion in comment 4 that it might not only affect spamassassin. 

But I don't think that the idea that in the meantime - until all qmail-scanner's are fixed, if the developer accepts it as a bug of his software - spamassassin should have a "workaround" is crazy, because it HAS a workaround, but that workaround is broken ... and so spamassassin is buggy.
Comment 12 spamassassinbugs 2011-08-07 17:44:08 UTC
> I used spamassassin 3.3.1 for experimenting. There the code for catching the
> case # Received from x1-6-00-04-bd-d2-e0-a3-k317.webspeed.dk ... seems to catch
> the qmail-scanner line 

In the qmail-scanner line there is "(user1@domain.local@188.45.128.1)" which causes the regular expression to match.

Received: from 188.45.128.1 (user1@domain.local@188.45.128.1) by
firstmailserver (envelope-from <user1@domain.local>, uid 201) with
qmail-scanner-2.05st

If I remove the "(user1@domain.local@188.45.128.1)" from qmail-scanner's received line, spamassassin seems to correctly handle the mail and triggers the qmail-scanner-handling-code.

I'm unsure whether the "(user1@domain.local@188.45.128.1)" is a feature of the newer qmail-scanner versions (and spamassassin has not been adjusted) or if the examples in Received.pm were incomplete (only using unauthenticated sessions). 

--> A new qmail-scanner check would have to be moved before the x-1-6-00-04-bd-d2-e0-a3-k317 check to fire. 
--> The qmail-scanner-expression would have to be rewritten to something like
if (!\S+ \(.{0,100}\) by \S+ \(.{0,100}\) with qmail-scanner/)
originally it was
if (!\S+ by \S+ \(.{0,100}\) with qmail-scanner/)

the $envfrom= expressions behind might also need adjustment to contain something useful.
Comment 13 Giampaolo Tomassoni 2011-08-07 18:03:35 UTC
Ok, let try a quick hack for this. Try commenting out line 1143  in http://svn.apache.org/viewvc/spamassassin/tags/spamassassin_release_3_3_1/lib/Mail/SpamAssassin/Message/Metadata/Received.pm?revision=923785 (match this line in your version) and check if this works.

If it does, please report here. Some developer/PMC may eventually like to revise that pice of code.
Comment 14 spamassassinbugs 2011-08-07 20:10:09 UTC
Created attachment 4945 [details]
Draft at which place the correct code has to be inserted

The patch works, but surely doesn't work correctly, because the $envfrom ... lines were copied from a different regex occuring later in the Received.pm
Comment 15 spamassassinbugs 2011-08-07 20:13:14 UTC
(In reply to comment #13)
> Ok, let try a quick hack for this. Try commenting out line 1143  in
> http://svn.apache.org/viewvc/spamassassin/tags/spamassassin_release_3_3_1/lib/Mail/SpamAssassin/Message/Metadata/Received.pm?revision=923785
> (match this line in your version) and check if this works.
> 
> If it does, please report here. Some developer/PMC may eventually like to
> revise that pice of code.

Can't find anything making sense at line 1143 in my editor in this line there is
$self->{qmail_scanner_env_from} = $envfrom; # hack!
which is never executed because the regex 3 lines before doesn't match and even if it would match another regex (not responsible for this case) would match miles before this code point is reached.

See the patch, how code that matches could be inserted at the right place.
Comment 16 spamassassinbugs 2011-08-07 20:15:30 UTC
Created attachment 4946 [details]
Header for testing purposes

Here you get a draft mail for testing purposes, so that you could feed it through spamassassin yourself and see when it triggers the PBL and when not.
Comment 17 Daryl C. W. O'Shea 2011-08-07 20:35:58 UTC
I'm confused.  Receiveing SpamAssassin installations should only be doing dial-up DNSBL checks on the relay that connects to them.  They should not be doing it on relays all throughout the received chain.

Your sample header appears to be generated from systems entirely local to your network.  With 192.168.0.0/16 addresses, I'n not surprised that the initial client is being checked.  In a case like this you need to define your MSA_NETWORKS.

Please provide an actual header sample that includes the received header of the remote network to demonstrate the issue you have described.
Comment 18 spamassassinbugs 2011-08-08 07:51:38 UTC
(In reply to comment #17)
> I'm confused.  Receiving SpamAssassin installations should only be doing
> dial-up DNSBL checks on the relay that connects to them.  They should not be
> doing it on relays all throughout the received chain.
> 
> Your sample header appears to be generated from systems entirely local to your
> network.  With 192.168.0.0/16 addresses, I'm not surprised that the initial
> client is being checked.  In a case like this you need to define your
> MSA_NETWORKS.

Spamassassin won't check the sender's IP if it would correctly detect the authentication against the relay.

I don't think you have tried running "spamassassin -t < file" with file being my test header and spotted the difference that correct header detection makes. If it knows qmail-scanner-2.05st headers it seems to work correctly in either case.

> Please provide an actual header sample that includes the received header of the
> remote network to demonstrate the issue you have described.

You won't get any other header sample from me because real-world sample would look the same (imagine a big network using internal ip ranges between several dislocated departments with the feature that you don't know every other mail server in this network). As the problem depends on spamassassin not correctly detecting the header forget the internal ips at the moment, because it would work with correct detection.

It can be that this special case causes extra problems, but the source of this particular problem seems to be the missing detection for newer/authenticated qmail-scanner-headers.
Comment 19 spamassassinbugs 2011-08-08 19:50:49 UTC
BTW: Trying to install the ruleset in an offline (=no internet) installation of spamassassin causes trouble as you need the sha1-signature (you could create it yourself locally, but that doesn't count):

1. Visit your download page (http://spamassassin.apache.org/downloads.cgi?update=201106220000)
2. Try to download the sha1-signature for the "SpamAssassin sa-update rules tarball".

Downloading
http://www.apache.org/dist/spamassassin/source/Mail-SpamAssassin-rules-3.3.2-r1104058.tgz.sha1 
fails. As the rules-tarball ends with .tar.gz, the sha1-checksum-file is .tar.gz.sha1, which can be downloaded if you and enter the correct link in your browser or favorite download tool.

Maybe somebody could fix the link?
Comment 20 spamassassinbugs 2011-08-08 21:31:00 UTC
(In reply to comment #14)
> Created attachment 4945 [details]
> Draft at which place the correct code has to be inserted
> 
> The patch works, but surely doesn't work correctly, because the $envfrom ...
> lines were copied from a different regex occuring later in the Received.pm

Testing shows that the patch seems to work correctly, because the regex tries to extract the part in the brackets of envelope-from and as there are no other brackets in the header extraction works correctly.

You might have to convert the patch-file from DOS-format to Unix-format (use dos2unix or something like that to fix line endings) to be able to use "patch -p0 < patchfile" in the directory where Received.pm resides.

For Spamassassin 3.3.2 the line
@@ -857,6 +857,13 @@
would have to be replaced by
@@ -861,6 +861,13 @@

But as this is a unified patch-file patch will be happily patching even this version correctly :-)

Summary:
Received.pm already contains code for handling qmail-scanner
from 188.45.128.1 by firstmail... with qmail-scanner ...
but doesn't correctly handle the header
from 188.45.128.1 (user@domain.local@188.45.128.1) by firstmail... with qmail-scanner ...
which is created by current qmail-scanner versions when authentication has been used, the patch fixes that.

The patch can be improved by moving the old qmail-scanner handling code at the correct line (where now the patch inserts its code) and by telling perl that the expression \(.{0,100}\) might be 0..1 time between the \S+ by \S+ ...

Then there would be no change in the number of lines in the Received.pm, but only an improvement for the current expression and a reordering of the rules, so that the expression can match.
Comment 21 spamassassinbugs 2011-08-08 21:47:34 UTC
substitute the expression from the released spamassassin versions

if (/^\S+ by \S+ \(.{0,100}\) with qmail-scanner/) 

with

if (/^\S+ (\(.{0,100}\) ){0,1}by \S+ \(.{0,100}\) with qmail-scanner/)

and move the code where I would have inserted my patch it seems to handle both cases
Comment 22 spamassassinbugs 2011-08-08 21:52:59 UTC
Created attachment 4947 [details]
Patch against released version 3.3.2 (cleaning up)

This is the patch putting the existing qmail-scanner handling routines at the right position handling both cases.

Maybe the example line from this bug
from xx.xx.xx.xx (user@domain.local@xx.xx.xx.xx) by firstmailserver ...
should be inserted so that nobody forgets why there is this strange 0,1 expression, but as I hope that the patch-file is now in unix-format I won't break it again.
Comment 23 spamassassinbugs 2011-08-08 21:56:54 UTC
(In reply to comment #22)
> Created attachment 4947 [details]
> Patch against released version 3.3.2 (cleaning up)
> 
> This is the patch putting the existing qmail-scanner handling routines at the
> right position handling both cases.
> 
> Maybe the example line from this bug
> from xx.xx.xx.xx (user@domain.local@xx.xx.xx.xx) by firstmailserver ...
> should be inserted so that nobody forgets why there is this strange 0,1
> expression, but as I hope that the patch-file is now in unix-format I won't
> break it again.

With this patch I get -1.0 ALL-TRUSTED instead of RCVD_FROM_PBL et al. as it should have always been.
Comment 24 spamassassinbugs 2011-08-08 22:12:58 UTC
Created attachment 4948 [details]
Patch against released version 3.3.2 (cleaning up, now the right version)

Former version was reversed patch.
Comment 25 spamassassinbugs 2011-08-12 22:07:58 UTC
Mr. Jason Haar told me on the sourceforge-maillist qmail-scanner-general that - as far as he remembers - the Received: header was used because of Exchange 5.0 removing all other headers. 

He suggests using "X-Qmail-Scanner-Diagnostics:" instead of the received header, making spamassassin happy if this doesn't cause other trouble (don't think that Exchange 5.0 is being still used). So this will be fixed somewhen.

Unfortunately this will also only work when all distributions providing qmail-scanner (don't know how many those are) have fixed this and users are using the fixed version.

As spamassassin already contains code for handling unfixed version's use of the Received:-header for unauthenticated mails I don't understand, why it shouldn't be improved to work with authenticated mails too (as only one code block needs to be moved upwards and the authentication expression needs to be matched optionally, so both cases would work).
Comment 26 AXB 2011-08-12 22:25:03 UTC
(In reply to comment #25)
> Mr. Jason Haar told me on the sourceforge-maillist qmail-scanner-general that -
> as far as he remembers - the Received: header was used because of Exchange 5.0
> removing all other headers. 
> 
> He suggests using "X-Qmail-Scanner-Diagnostics:" instead of the received
> header, making spamassassin happy if this doesn't cause other trouble (don't
> think that Exchange 5.0 is being still used). So this will be fixed somewhen.
> 
> Unfortunately this will also only work when all distributions providing
> qmail-scanner (don't know how many those are) have fixed this and users are
> using the fixed version.
> 
> As spamassassin already contains code for handling unfixed version's use of the
> Received:-header for unauthenticated mails I don't understand, why it shouldn't
> be improved to work with authenticated mails too (as only one code block needs
> to be moved upwards and the authentication expression needs to be matched
> optionally, so both cases would work).

You do realize that Exchange 5 was released in 1997 & has been EOL for at least 10 years and won't run on any recently supported OS?

Out of curiosity: Which OS distros include qmail-scanner?
Comment 27 spamassassinbugs 2011-08-13 13:03:32 UTC
Don't think that anyone still uses Exchange 5.0, but I don't know how other servers behave regarding X-... headers.

Don't know which distros ship with qmail-scanner (at least there are some wikis about installing it (see The Linux Documentation Project or a entry on google how to install it on Ubuntu). If people install it themselves it is much more probably that they won't be running the current version ...

If you want to get to know a distro where it is in the package repository (because you want confirm the bug) you could try Gentoo.

---

I want to remember you of two interesting pages telling something about qmail/qmail-scanner and spamassassin:
http://wiki.apache.org/spamassassin/IntegratedInQmailWithQmailScanner
"SpamAssassin can be integrated into qmail system-wide using qmail-scanner."

http://spamassassin.apache.org/
"Flexible: SpamAssassin ... can be used on a wide variety of email systems including ... qmail"

If you search in your bug database you will find an entry (not from me) that qmail-scanner is often used with qmail.

Strange that spamassassin's marketing guys are advertising spamassassin's use with qmail, but that people on spamassassin's bugzilla don't want to fix spamassassin to completely support qmail-scanner, which integrates spamassassin into qmail (especially as it wouldn't be a great change).

Maybe you should discuss this with your own marketing people (and maybe with QA why support is incomplete). 

BTW: You seem to haven't fixed the download for the sha1-signatures for the rules tarball reported in comment 19 (you don't have any component seeming appropriate for a separate bug report for broken links).
Comment 28 Karsten Bräckelmann 2011-08-16 19:31:31 UTC
(In reply to comment #22)
> Maybe the example line from this bug
> from xx.xx.xx.xx (user@domain.local@xx.xx.xx.xx) by firstmailserver ...
> should be inserted so that nobody forgets why there is this strange 0,1
> expression, [...]

(In reply to comment #21)
> if (/^\S+ (\(.{0,100}\) ){0,1}by \S+ \(.{0,100}\) with qmail-scanner/)

FWIW, rather than the {0,1} range quantifier, the equivalent "optional" quantifier should be used. It's widely used and immediately obvious what it does.

  (\(.{0,100}\) )?
Comment 29 John Hardin 2011-08-16 19:56:06 UTC
(In reply to comment #28)
> (In reply to comment #21)
> > if (/^\S+ (\(.{0,100}\) ){0,1}by \S+ \(.{0,100}\) with qmail-scanner/)
> 
> FWIW, rather than the {0,1} range quantifier, the equivalent "optional"
> quantifier should be used. It's widely used and immediately obvious what it
> does.
> 
>   (\(.{0,100}\) )?

Also: While I see a \0 in the code following that RE, I don't see a \1 or $1 anywhere. Is a capturing subgroup really needed there?

  /^\S+ (?:\(.{0,100}\) )?by \S+ \(.{0,100}\) with qmail-scanner/
Comment 30 spamassassinbugs 2011-08-16 21:04:13 UTC
(In reply to comments #29 and #28)
Thanks for constructive ideas and not discussing about which mail system is the best and which shouldn't be supported.

> > (In reply to comment #21)
> > > if (/^\S+ (\(.{0,100}\) ){0,1}by \S+ \(.{0,100}\) with qmail-scanner/)
> > 
> > FWIW, rather than the {0,1} range quantifier, the equivalent "optional"
> > quantifier should be used. It's widely used and immediately obvious what it
> > does.
> > 
> >   (\(.{0,100}\) )?

(In reply to comment #28)
The regex can be surely improved. The code was rather focused on "it works" than on "the code looks pretty/is performant".

The comment was rather meant to make clear that the added expression is needed for catching authenticated headers, which had no example (and which I also forgot in the current patch :-(

> 
> Also: While I see a \0 in the code following that RE, I don't see a \1 or $1
> anywhere. Is a capturing subgroup really needed there?
> 
>   /^\S+ (?:\(.{0,100}\) )?by \S+ \(.{0,100}\) with qmail-scanner/

(In reply to comment #29)
--> don't understand the "?:" in your expression, it looks much like
my $val2= $value>5?1:0;
which is
if ($value>5) {$val2=1;}
else {$val2=0;}
what is not wanted here

shouldn't the changed expression rather look like:
/^\S+ (\(.{0,100}\) )?by \S+ \(.{0,100}\) with qmail-scanner/

so that it matches expressions that match either
/^\S+ by \S+ \(.{0,100}\) with qmail-scanner/
or
/^\S+ \(.{0,100}\) by \S+ \(.{0,100}\) with qmail-scanner/

But I haven't tested your regex, so it might be my lack of regex knowledge.

Regarding the code following the RE this was code already included in spamassassin-sources, so somebody will have had reasons for that before it was included in the sources tree (it's in the current release)? (sounds much like I wouldn't change working code, at least when I don't understand what it does ;-) )
Comment 31 Karsten Bräckelmann 2011-08-19 16:27:23 UTC
(In reply to comment #30)
> The comment was rather meant to make clear that the added expression is needed
> for catching authenticated headers, which had no example (and which I also
> forgot in the current patch :-(

Adding examples or clarification is good, not questioning that. What I was referring to was the comment to "add example to explain the strange 0,1 expression". With an emphasis on strange. My point being, that quantifier indeed is strange, since it reads "0 or more, up to 1", which *is* zero or one. Which really is "optional". No confusion there, concerning the RE. An additional example might still be in order, to explain the issue.

> > Also: While I see a \0 in the code following that RE, I don't see a \1 or $1
> > anywhere. Is a capturing subgroup really needed there?

It's not a backreference, it's an unnecessary escape.

> >   /^\S+ (?:\(.{0,100}\) )?by \S+ \(.{0,100}\) with qmail-scanner/
> 
> (In reply to comment #29)
> --> don't understand the "?:" in your expression, it looks much like

It makes it a non-capturing group.

All matches in plain parenthesis are capturing, that means they are available as \n or $n later. Whereas /(?:foo)/ is non-capturing, pure grouping, not having any effect on the numbering of $n. This is important, if the matches are used within or after the RE.
Comment 32 Henrik Krohns 2019-08-01 06:44:09 UTC
Closing old stale bug. I think it was even fixed in qmail-scanner itself.

2.10	16/Aug/2011
*	Change Received: header used to show diagnostic detail
	to X-Qmail-Scanner-Diagnostics: - that will make SpamAssassin
	happier