James Server
JAMES-344

FetchMail cannot parse particular format of "Received" header

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 3.0-M2
    • Component/s: FetchMail
    • Labels:
      None

      Description

      The mail server I am pulling e-mail from inserts a "Received" header that looks like the following:

      Received: from unknown (HELO host.domain.tld) (192.168.255.254) by ...

      BTW - The name "unknown" is always used. I assume they are purposely saving processing power by not reverse-looking up the host name.

      I have debugged this problem in the code, and it appears that because the IP address is not surrounded by square brackets, computeRemoteAddress is unable to find the IP address. So the name "unknown" is always used to determine the address instead, which fails.

      FYI - The e-mail I am pulling actually passes through two e-mail servers by different organizations, and they both use this format. So I assume this format is common.

        Activity

        Norman Maurer added a comment -

        Fixed. Both [] and () are now allowed.

        Norman Maurer added a comment -

        Move to M2

        Norman Maurer added a comment -

        Just assign to 3.0 release

        Ralph B Holland added a comment - edited

        This parsing issue is also causing problems for me with my POP3 server. I had to add UnknownHostException handling to MessageProcessor to stop it crashing on an invalid trace header.

        RFC 5321, section 4.4 ("Trace Information"), states that the Received: field was added for debugging. The standard goes on to say that the field may not conform and that NO mail server should alter the format, i.e. expect it to differ from the standards.

        That comment aside, I notice in RFC 5321 that an IP address is actually specified inside ( ) and not [ ], so it seems to me that the ( ) should be permitted.

        I believe the parser can give preference to the [ ]; if none is found, then the contents of the innermost ( ) should be used.
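
        A minimal sketch of that preference order, for illustration only. It is not the actual James computeRemoteAddress code; the class and method names are hypothetical, and it assumes an IPv4 address wrapped directly in [ ] or ( ), which is a simplification of "innermost ( )".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of James: picks an IPv4 address out of a
// Received header, preferring [a.b.c.d] and falling back to (a.b.c.d).
class ReceivedAddressSketch {
    // Dotted-quad IPv4 only; octet ranges and IPv6 are not checked in this sketch.
    private static final String IPV4 = "(\\d{1,3}(?:\\.\\d{1,3}){3})";
    private static final Pattern BRACKETED = Pattern.compile("\\[" + IPV4 + "\\]");
    private static final Pattern PARENTHESIZED = Pattern.compile("\\(" + IPV4 + "\\)");

    static Optional<String> extractRemoteAddress(String receivedHeader) {
        // First choice: an address in square brackets.
        Matcher m = BRACKETED.matcher(receivedHeader);
        if (m.find()) {
            return Optional.of(m.group(1));
        }
        // Fallback: an address in parentheses.
        m = PARENTHESIZED.matcher(receivedHeader);
        if (m.find()) {
            return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String[] headers = {
            "Received: from canberranet.com.au (unverified [202.168.8.70]) by canberranet.com.au",
            "Received: from unknown (HELO canberranet.com.au) (202.168.8.8) by Mail-SeCure"
        };
        for (String header : headers) {
            System.out.println(extractRemoteAddress(header).orElse("not found"));
        }
    }
}
```

        With the two header styles quoted in this comment, the first yields the bracketed address and the second falls back to the parenthesized one.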

        I attach an example email header extracted by James fetchmail from my POP3 server, which is running windoze software at my ISP (who hosts my domain). Note the second header put on by my provider's email server canberranet.com.au:

        Return-Path: <qdyunjbeea@palace-furniture.co.uk>
        Delivered-To: ralph@localhost
        Received: from canberranet.com.au (unverified [202.168.8.70])
        by canberranet.com.au (SurgeMail 3.7b8) with ESMTP id 76651460
        for <xxxx@arising.com.au>; Fri, 01 Jan 2010 14:28:16 +1100
        Received: from unknown (HELO canberranet.com.au) (202.168.8.8)
        by Mail-SeCure (envelope-from qdyunjbeea@palace-furniture.co.uk)
        with SMTP; 1 Jan 2010 12:32:36 +1100
        Received: from ll62-72-188-251-62.ll62.iam.net.ma (unverified [62.251.188.72])
        by canberranet.com.au (SurgeMail 3.7b8) with ESMTP id 76642457
        for multiple; Fri, 01 Jan 2010 12:36:46 +1100
        X-Verify-SMTP: Host 62.251.188.72 sending to us was not listening
        Date: Fri, 01 Jan 2010 01:34:55 +0100
        Message-ID: <001101ca8a7a$3d46e520$00426158@ocutnqtp>
        From: "Ruby Royale Online Games" <qdyunjbeea@palace-furniture.co.uk>
        To: <xxx@arising.com.au>
        Subject: Great pokies and offers [spam]
        MIME-Version: 1.0
        Content-Type: text/plain; charset=iso-8859-1
        Content-Transfer-Encoding: quoted-printable
        X-Server: High Performance Mail Server - http://surgemail.com r=1325883893
        X-Rcpt-To: <xxx@arising.com.au>
        X-IP-stats: Incoming Outgoing Last 0, First 762, in=2251436, out=27829717, spam=0 Known=true
        X-External-IP: 202.168.8.70
        Status: U
        X-UIDL: 1262316496.856_419757.mrmail2
        X-fetched-from: mail.arising.com.au

        I will be looking at making changes to the Received: field parser to handle the brackets ( ) around the IP address after the HELO response.

        Just to help out further, here is the email received from the brutus.apache.org JIRA signup I just made:

        Return-Path: <yyy@apache.org>
        Delivered-To: ralph@localhost
        Received: from canberranet.com.au (unverified [202.168.8.70])
        by canberranet.com.au (SurgeMail 3.7b8) with ESMTP id 76659501
        for <xxx@arising.com.au>; Fri, 01 Jan 2010 16:01:24 +1100
        Received: from unknown (HELO brutus.apache.org) (140.211.11.140)
        by Mail-SeCure (envelope-from yyy@apache.org)
        with SMTP; 1 Jan 2010 14:08:32 +1100
        Received: from brutus.apache.org (localhost [127.0.0.1])
        by brutus.apache.org (Postfix) with ESMTP id 62A8D234C045
        for <xxx@arising.com.au>; Thu, 31 Dec 2009 19:08:29 -0800 (PST)
        Message-ID: <356860250.1262315309388.JavaMail.yyy@brutus.apache.org>
        Date: Fri, 1 Jan 2010 03:08:29 +0000 (UTC)
        From: yyy@apache.org
        To: xxx@arising.com.au
        Subject: [jira] Account signup
        MIME-Version: 1.0
        Content-Type: text/plain; charset=utf-8
        Content-Transfer-Encoding: 7bit
        X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
        X-Server: High Performance Mail Server - http://surgemail.com r=1325883893
        X-Rcpt-To: <xxx@arising.com.au>
        X-IP-stats: Incoming Outgoing Last 0, First 763, in=2252824, out=27830757, spam=0 Known=true
        X-External-IP: 202.168.8.70
        Status: U
        X-UIDL: 1262322084.856_424086.mrmail2
        X-fetched-from: mail.arising.com.au

        You have signed up for a JIRA account at:

        https://issues.apache.org/jira

        Here are the details of your account:

        ....contents snipped and addresses anonymized ...

        Danny Angus added a comment -

        The important factor here is that by supporting non-compliance we are eroding the value of the very standards upon which interoperability relies.
        We are in danger of encouraging a situation in which true interoperability is not only dependent upon clear standard behaviour but also upon undocumented variance from those same standards.

        The James policy for issues of non-compliance tries to tread the fine line between a pragmatic acceptance of other people's misinterpretation of the RFCs and an evangelical enforcement of "the letter of the law".

        In practice this policy is that certain well-argued cases of non-compliance, which can be safely worked around, will be tolerated by James.

        In cases (like this one) where variance from a published standard is required, it is desirable that this functionality is disabled by default, well documented, and only enabled by explicit configuration.

        In cases where the behaviour is not within the scope of any standard which James claims to support (such as behaviour which is a de facto standard or proposed RFC but not yet the subject of an RFC on the standards track), it is acceptable to implement the behaviour so long as it is adequately documented and users can be clear about what to expect from James.

        Jeff Keyser added a comment -

        I suppose that if everyone else who runs into incompatibilities like this is a Java developer and wants to learn how to write Mailets, then that would be a second option.

        In any case, we can agree to disagree on what solution is "best." If you don't see an issue here, then go ahead and close this.

        Steve Brewin added a comment -

        I no more expect non-compliant servers to change than I expect that we could add support for all of the flavours of non-compliant headers that exist now and in the future.

        As I suggested in my previous comment, my view is that the best way to handle this is to leverage the existing functionality and handle them via a Mailet. This way, malformed headers injected by any means (fetchmail, SMTP, whatever) can be handled in the same way by the same code. Such a Mailet could use the regex technique you suggest.

        Regarding JAMES-345, it's in my queue.

        Jeff Keyser added a comment -

        I'm looking more closely at my e-mail headers, and I remembered only half-correctly. Both servers forget the square brackets when inserting the IP address, but only one uses "unknown" instead of the actual host name it looked up.

        The header inserted by Verizon's mail server mentions "MailPass SMTP server v1.1.1 - 121803235448JY". Global Name Registry, which uses the host name "unknown", doesn't provide any clues about its software.

        As far as supporting non-compliant headers, I would suggest that it's easier to adapt to those that are non-compliant than to expect those that aren't to change. If you're concerned about going down a slippery slope, one option could be to allow the user to supply a regular expression that parses the particular header he/she is seeing.
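
        To make the regular-expression idea concrete, here is a rough sketch under stated assumptions. FetchMail has no such option as far as this thread shows; the class name, constructor, and the convention that capturing group 1 holds the address are all hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser driven by a user-supplied regular expression.
// Assumed convention: capturing group 1 of the expression is the remote address.
class ConfigurableReceivedParser {
    private final Pattern pattern;

    ConfigurableReceivedParser(String userRegex) {
        this.pattern = Pattern.compile(userRegex);
        // Reject expressions that cannot yield an address at all.
        if (pattern.matcher("").groupCount() < 1) {
            throw new IllegalArgumentException("expression needs at least one capturing group");
        }
    }

    Optional<String> remoteAddress(String receivedHeader) {
        Matcher m = pattern.matcher(receivedHeader);
        // group(1) may be null if the group sits in an unmatched alternative.
        return m.find() ? Optional.ofNullable(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Expression written for the "(HELO host) (ip)" format from the description.
        ConfigurableReceivedParser parser = new ConfigurableReceivedParser(
            "\\(HELO [^)]*\\)\\s+\\((\\d{1,3}(?:\\.\\d{1,3}){3})\\)");
        String header = "Received: from unknown (HELO host.domain.tld) (192.168.255.254) by ...";
        System.out.println(parser.remoteAddress(header).orElse("not found"));
    }
}
```

        The expression would have to come from configuration, so each site could describe only the header format it actually sees.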

        Since you mentioned it here, please take a closer look at the issue I actually reported in JAMES-345. That issue does not pertain to looking up a host name, but the fact that FetchMail doesn't work correctly when it can't, regardless of the reason. It just so happens that this issue brought that one to light.

        Steve Brewin added a comment -

        Even if on visual examination the IP address appears to be present, the format is invalid according to my reading of the RFCs.

        Unless it's a well-known de facto divergence from the RFC, personally I am not keen on supporting it. Once we start on such a path, where do we stop?

        Far from being a moot point, this is the crux of the issue. How do we deal with non-compliant trace information? The current strategy is to allow the choice of rejection or propagation with a warning mail attribute attached. If the latter option is chosen, anyone is free to write a Mailet that interprets and adjusts the trace information as they choose after injection into the James spool.

        As noted in JAMES-345, there may be an issue with the implementation, but the intent is in my view correct.

        Out of interest, do you know what breed of mail servers are producing this information?

        Steve Brewin added a comment -

        Reopened using the Reopen-Issue workflow action.
        You need to be logged in. Not sure if this action is available to all users.

        Jeff Keyser added a comment -

        How does one reopen a bug?

        Jeff Keyser added a comment -

        I guess I gave too much background information and ended up confusing the issue. I'll try to be a little more clear.

        First, the issue as I see it is that the parser is not finding the actual IP address in the header because it isn't surrounded by square brackets. If it found this address, the invalid host name "unknown" would be moot.

        Second, what I was saying was that the e-mail I receive contains two different "Received" headers that use the format I described, inserted by two different e-mail servers from two different organizations. My point wasn't that it's valid, but that it's commonly used and may cause the same problem for other users.

        I apologize for being unclear in my first posting.

        Steve Brewin added a comment -

        See RFC 2821, "4.4 Trace Information" for details of expected Received from: processing.

        In short, relaying servers should not alter existing Received from: headers so the fact that downstream servers do not reject or alter these headers is to be expected.

        "Received from: unknown" is invalid as it does not contain a valid domain literal.

        Use <remoteReceivedHeader reject="false" .../> (see http://james.apache.org/fetchmail_configuration_2_2.html#remoteReceivedHeader) to control handling of such invalid domain literals.

        A fix for issue JAMES-302 is required. This is included in James 2.2.1 RC1.

        Feel free to reopen if the above does not resolve this issue.

        – Steve


          People

          • Assignee:
            Norman Maurer
            Reporter:
            Jeff Keyser
          • Votes:
            1
            Watchers:
            1
