James Mime4j / MIME4J-196

Lenient parsing of mail addresses should be a little more lenient

    Details

    • Type: Wish
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7
    • Component/s: parser (core)
    • Labels:
      None

      Description

      Parsing a mail address like the one in https://issues.apache.org/jira/browse/MIME4J-31 results in a ParseException. Parsing a mail address starting with a dot (.) also results in a ParseException.
      When parsing an address field with multiple addresses, the exception that occurs while parsing a single address is caught and null is returned as the resulting address list. (This breaks Tika, which expects an empty list rather than null.)

      It would be nice if invalid addresses were handled more gracefully in lenient mode, and if at least the correct addresses were returned when parsing an address list that contains a corrupted address.
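
      The requested behaviour can be illustrated with a minimal sketch. This is not Mime4j code: the class LenientAddressListSketch and the methods parseSingle and parseLenient are hypothetical names, the validity check is deliberately simplistic, and splitting on commas ignores quoted display names and groups. It treats a leading dot as malformed purely for illustration. The point is only that a failure on one address is caught per address, so the valid ones survive and the result is an empty list rather than null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch, not Mime4j code: keep valid addresses, skip malformed ones. */
class LenientAddressListSketch {

    /** Simplistic stand-in for a real single-address parser (assumption). */
    static String parseSingle(String raw) {
        String s = raw.trim();
        int at = s.indexOf('@');
        boolean malformed = s.isEmpty()
                || s.startsWith(".")            // leading dot, as in the report
                || at <= 0                      // missing local part or no '@'
                || at != s.lastIndexOf('@')     // more than one '@'
                || at == s.length() - 1;        // missing domain
        if (malformed) {
            throw new IllegalArgumentException("Malformed address: " + raw);
        }
        return s;
    }

    /** Parses a comma-separated list; never returns null. */
    static List<String> parseLenient(String field) {
        if (field == null) {
            return Collections.emptyList();
        }
        List<String> result = new ArrayList<>();
        // Naive split: a real parser must honour quoted strings and groups.
        for (String part : field.split(",")) {
            try {
                result.add(parseSingle(part));
            } catch (IllegalArgumentException e) {
                // Lenient mode: drop only the broken entry and continue.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseLenient(
                "alice@example.org, .bob@example.org, carol@example.org"));
    }
}
```

      With this approach a caller such as Tika would always receive a list, possibly empty, and would not need a null check.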

      I am using Mime4j via the Apache Tika project to extract text from emails for indexing in Lucene. Tika's text stream is read directly by a Lucene field, and indexing fails if Mime4j throws an exception. This currently happens every time a header field contains more than 1000 characters, because Tika uses the Mime4j default configuration, which is unusable for this purpose ( https://issues.apache.org/jira/browse/TIKA-640 ), and every time a malformed email address is encountered ( https://issues.apache.org/jira/browse/TIKA-641 ).

      These problems can be taken care of in Tika, but there is no way for Tika to retrieve the valid mail addresses from a list if Mime4j returns only null. Maybe this problem could be addressed in Mime4j.

        Activity

        Oleg Kalnichevski added a comment -

        I am working on a set of low-level parsing routines that could be used to assemble more lenient / tolerant field parsers, but this issue may have to wait until 0.8.

        Oleg

        Oleg Kalnichevski added a comment -

        There is now a more lenient implementation of FieldParser in SVN trunk. When running in lenient mode, the message builder will pick up lenient field parsers by default unless configured to use a custom FieldParser implementation.

        Could you please retest your application against the latest SVN snapshot of mime4j and let me know if the problem has been resolved to your satisfaction?

        Oleg


          People

          • Assignee: Unassigned
          • Reporter: Jens Wilmer
          • Votes: 0
          • Watchers: 0