James Mime4j / MIME4J-140

MIME4J-57 is not practical in its limits and incorrect in its RFC interpretation

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.7
    • Component/s: None
    • Labels:
      None

      Description

      I have begun playing with Mime4j for potential use in a software project. Very quickly I found a simple email (which I can attach) that has about 30 To addresses. By default, parsing it throws an exception.

      Looking at MIME4J-57 the author has misunderstood the SMTP RFC 2821. Yes you are limited to 998 octets PER LINE, but you may FOLD as many 998 octet lines as you wish. Technically it's 100% legal to have a 50 megabyte header value, as long as it is folded. (per 76 or 998 rules).
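To illustrate the folding argument, here is a minimal Java sketch that does not use Mime4j at all. The class name UnfoldSketch and its helper methods are made up for this example. It assumes ASCII input, so characters and octets coincide, and it takes the physical lines with their CRLF already stripped, which means unfolding reduces to concatenation.

```java
import java.util.ArrayList;
import java.util.List;

public class UnfoldSketch {
    /** Per-line limit from RFC 5322, excluding the trailing CRLF. */
    public static final int MAX_LINE_CHARS = 998;

    /**
     * Unfolds one header field given as physical lines without CRLF.
     * Unfolding removes each CRLF that precedes whitespace and keeps the
     * whitespace, so with CRLF already stripped it is plain concatenation.
     */
    public static String unfold(List<String> physicalLines) {
        StringBuilder unfolded = new StringBuilder();
        for (int i = 0; i < physicalLines.size(); i++) {
            String line = physicalLines.get(i);
            if (line.length() > MAX_LINE_CHARS) {
                throw new IllegalArgumentException("line " + i + " exceeds " + MAX_LINE_CHARS + " characters");
            }
            // Every line after the first must start with a space or tab.
            if (i > 0 && (line.isEmpty() || (line.charAt(0) != ' ' && line.charAt(0) != '\t'))) {
                throw new IllegalArgumentException("line " + i + " is not a continuation line");
            }
            unfolded.append(line);
        }
        return unfolded.toString();
    }

    /** Builds a folded To field with one made-up address per physical line. */
    public static List<String> buildFoldedTo(int recipients) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < recipients; i++) {
            String addr = "<user" + i + "@example.com>" + (i < recipients - 1 ? "," : "");
            lines.add(i == 0 ? "To: " + addr : " " + addr);
        }
        return lines;
    }

    /** Returns the length of the longest physical line. */
    public static int longestLine(List<String> lines) {
        int longest = 0;
        for (String line : lines) {
            longest = Math.max(longest, line.length());
        }
        return longest;
    }

    public static void main(String[] args) {
        List<String> folded = buildFoldedTo(100);
        System.out.println("longest physical line: " + longestLine(folded));
        System.out.println("unfolded field length: " + unfold(folded).length());
    }
}
```

With 100 recipients every physical line stays far below 998 characters, while the unfolded field is well over 1000 characters long.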

      I think the limit chosen by default of 1000 is absurdly low - this should be 100000 minimum or perhaps even unlimited by default. There is something to be said for a sanity check option, for sure - but not one that is triggered so easily.

      I can also open somewhat related JIRAS if people find them of merit:

      1. Documentation - defaults should be clearly stated in MimeEntityConfig javadoc. They are not
      2. Bug - The javadocs for MimeEntityConfig claim mc.setMaxHeaderCount(-1); would defeat this check. It does not (I worked around with Integer.MAX_VALUE)
      3. Design Question: Should the MimeTokenStream not have a public constructor that allows MimeEntityConfig to be fed? As it was, I had to create my own subclass to access the protected constructor - is there a reason for this?

      Thanks

      Example header that blew stuff up (and I think we've all seen far far worse!) - The To line triggers this

      Return-Path: <tomxypdq@hotmail.com>
      Received: from c.mx.sonic.net (c.mx.sonic.net [64.142.100.46])
      by eth0.a.lds.sonic.net (8.13.8.Beta0-Sonic/8.13.7) with ESMTP id mBT21U5h027864;
      Sun, 28 Dec 2008 18:01:30 -0800
      Received: from bay0-omc2-s13.bay0.hotmail.com (bay0-omc2-s13.bay0.hotmail.com [65.54.246.149])
      by c.mx.sonic.net (8.13.8.Beta0-Sonic/8.13.7) with ESMTP id mBT21QuA026548;
      Sun, 28 Dec 2008 18:01:30 -0800
      Received: from BAY117-W11 ([207.46.8.46]) by bay0-omc2-s13.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.3959);
      Sun, 28 Dec 2008 18:01:26 -0800
      Message-ID: <BAY117-W1177D87B46BEFE5606716BDDE60@phx.gbl>
      Content-Type: multipart/mixed;
      boundary="03df338b-5029-48d8-84e8-34f5e171dcbd"
      X-Originating-IP: [96.228.108.66]
      From: Tommy Clark <tomxypdq@hotmail.com>
      To: <alayne.newton@thomson.com>,
      Alexandra Droman
      <alexandra_dorman_100@yahoo.com>,
      Alexis Steinkamp <lexilooo@hotmail.com>, <asteinkamp@ameritech.net>,
      <attame@msn.com>, Ben Greenberg
      <bprofane68@hotmail.com>,
      blythe gross <muppetgirl1969@yahoo.com>, <brenden@mediamystic.com>,
      <cliverping@hotmail.com>, Dae-Jin Kim
      <polykor@chollian.net>,
      Doug Arthur <dougside@yahoo.com>,
      Dox Doxiadis
      <evdoxios.doxiadis@gmail.com>, <doxiadis@princeton.edu>,
      Haidde Sprague
      <haidee.sprague@gmail.com>,
      James Lee <jcl0072@hotmail.com>, Jeff Dorman
      <bub365@aol.com>,
      <jeffejeff@gmail.com>, "Jeff Lim (E-mail)"
      <jeffreyelim@hotmail.com>,
      Jeff Moshman <jmosesian@sonic.net>, Karen Wolfe
      <kaka_2702@yahoo.com>,
      <keirabby@charter.net>, keirabby
      <keirabby@cableone.net>,
      <keirmo@yahoo.com>, Kerry Levenberg
      <kerry@levenbergs.com>,
      Kim-Chi Steger <kcsteger@aol.com>, <lornap78@hotmail.com>,
      <mbell90@sonic.net>, mike bell <mjb@gwava.com>, <myra13@aol.com>,
      Natalie Stange <nstange@nyc.rr.com>,
      karen wolfe
      <ngocbao99@yahoo.com>, <polykor@chol.com>,
      Rob Cliver
      <cliver@fulbrightweb.org>, Sharon Lee <weronron@yahoo.com>,
      the Clarks
      <bosudary@comcast.net>, Ward Breeze <wbreeze@gunder.com>,
      <whosbarley@yahoo.com>
      Subject: N More THANKS
      Date: Sun, 28 Dec 2008 18:01:25 -0800
      Importance: Normal
      In-Reply-To: <BAY117-W268E2179DDEE877110AF11DDE60@phx.gbl>
      References: <c46c51bb0812281737i256ae8depebb851f79b54c326@mail.gmail.com>
      <BAY117-W268E2179DDEE877110AF11DDE60@phx.gbl>
      MIME-Version: 1.0
      X-OriginalArrivalTime: 29 Dec 2008 02:01:26.0088 (UTC) FILETIME=[5B42CC80:01C96959]
      X-Sonic-SB-IP-RBLs: IP RBLs sorbs-spam.

        Activity

        mike bell added a comment -

        Sorry I'm wrong about #2. No bug. Brain fart.

        Oleg Kalnichevski added a comment -

        > Yes you are limited to 998 octets PER LINE, but you may FOLD as many 998 octet lines as you wish.
        > Technically it's 100% legal to have a 50 megabyte header value, as long as it is folded. (per 76 or 998 rules).

        The document that deals with line folding is RFC822 [1]. I personally cannot find any provision in the RFC that supports this claim. My personal interpretation is that a header line can be folded to make it human readable but the total limit of 998 still applies.

        If this limit is absurdly low for real world messages, I have no problem increasing it. But there is already a config parameter one can use to override it.

        Oleg

        [1] http://www.ietf.org/rfc/rfc822.txt

        Markus Wiederkehr added a comment -

        RFC 822 is superseded by RFC 5322 which states that "... to deal with the 998/78 character limitations per line, the field body portion of a header field can be split into a multiple-line representation" (2.2.3. Long Header Fields).

        So I think this is a valid issue.

        @Oleg: what config parameter are you referring to? Nothing in MimeEntityConfig seems to apply.

        Oleg Kalnichevski added a comment -

        @Markus

        MimeEntityConfig#getMaxLineLen()

        Oleg

        Markus Wiederkehr added a comment -

        No I don't think so. MimeEntityConfig#getMaxLineLen() should specify the maximum number of characters a line may have in the raw message. This is before multiple header lines get unfolded into a single line by the parser. An unfolded line may well be longer than this and I don't think we have a configuration parameter for that.

        Markus Wiederkehr added a comment -

        In other words MimeEntityConfig#getMaxLineLen() should be used for BufferedLineReaderInputStream as it is but we need something different (or maybe nothing at all) for AbstractEntity#fillFieldBuffer().

        Oleg Kalnichevski added a comment -

        > No I don't think so

        You are very welcome to disagree but this is how it works at the moment. I have no problem with having another parameter but I do not think indefinite line folding should be allowed by default.

        Oleg

        Markus Wiederkehr added a comment -

        1000 is the perfect default value for MaxLineLen because "each line of characters MUST be no more than 998 characters" (RFC 5322). I believe we count the CRLF in, so 998 + 2 = 1000.

        But 1000 clearly is a very bad value for the length of an unfolded header because unfolded headers may be longer. I think the RFC is pretty clear about that.

        We should introduce another parameter, maybe MaxHeaderLen or something like that and think of a reasonable default value (50000?).

        (And obviously I wasn't arguing about how it works at the moment but how it should work.)
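The two limits discussed in this comment could be sketched roughly as follows. This is not the committed patch or Mime4j's API: the class HeaderLimitsSketch, the method checkAndUnfold, and both constants are made up for illustration, and the value 50000 is only the default floated above. The sketch assumes the raw field still contains its CRLF line breaks and treats a negative limit as "disabled".

```java
public class HeaderLimitsSketch {
    /** Raw line limit: 998 characters plus the two octets of CRLF. */
    public static final int DEFAULT_MAX_LINE_LEN = 998 + 2;
    /** Hypothetical default for the unfolded field, as floated in the thread. */
    public static final int DEFAULT_MAX_HEADER_LEN = 50000;

    /**
     * Checks one raw header field (physical lines separated by CRLF)
     * against a per-line limit and a separate limit on the unfolded
     * field, then returns the unfolded field. A negative limit disables
     * the corresponding check.
     */
    public static String checkAndUnfold(String rawField, int maxLineLen, int maxHeaderLen) {
        StringBuilder unfolded = new StringBuilder();
        for (String line : rawField.split("\r\n")) {
            // The per-line check counts the CRLF that split() removed.
            if (maxLineLen >= 0 && line.length() + 2 > maxLineLen) {
                throw new IllegalStateException("line too long: " + (line.length() + 2));
            }
            unfolded.append(line);
            // The field-level check applies to the accumulated, unfolded text.
            if (maxHeaderLen >= 0 && unfolded.length() > maxHeaderLen) {
                throw new IllegalStateException("header too long: " + unfolded.length());
            }
        }
        return unfolded.toString();
    }

    /** Returns a string of n copies of c. */
    public static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String chunk = repeat('a', 900);
        String raw = "X: " + chunk + "\r\n " + chunk + "\r\n " + chunk + "\r\n";
        String unfolded = checkAndUnfold(raw, DEFAULT_MAX_LINE_LEN, DEFAULT_MAX_HEADER_LEN);
        System.out.println("unfolded length: " + unfolded.length());
    }
}
```

Keeping the checks separate means the per-line limit can stay at the RFC value while the unfolded limit is tuned or disabled independently.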

        Markus Wiederkehr added a comment -

        Patch committed, please review.

        Markus Wiederkehr added a comment -

        @Mike: Regarding the other issues:

        #1 and #2 should be fixed in trunk.

        #3: I don't know why that constructor is not public. But do you really want to use MimeTokenStream directly? MimeStreamParser has a public constructor that passes the specified MimeEntityConfig to MimeTokenStream. Please file a separate JIRA if you still want that to be addressed.

        Oleg Kalnichevski added a comment -

        Works for me

        Oleg

        Oleg Kalnichevski added a comment -

        Marking as resolved

        Oleg


          People

          • Assignee:
            Markus Wiederkehr
            Reporter:
            mike bell