TIKA-640

RFC822Parser should configure Mime4j not to fail when reading mails containing more than 1000 characters in a single header's text (even if folded)

    Details

    • Type: Wish
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9
    • Fix Version/s: 0.10
    • Component/s: parser
    • Labels:
    • Environment: All

      Description

      The standard configuration of Mime4j accepts only 1000 characters per line and 1000 characters per header. Tika's streaming approach should not need these limitations. When either limit is exceeded, an exception is thrown and none of the data read so far is available.

      Solution:
      Replace all occurrences of:

      Parser parser = new RFC822Parser();

      by:

      MimeEntityConfig config = new MimeEntityConfig();
      config.setMaxLineLen(-1);
      config.setMaxContentLen(-1);
      Parser parser = new RFC822Parser(config);
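
      To make the "even if folded" case concrete, here is a minimal, library-free sketch of how a folded header can exceed a per-header limit although every physical line is short. The class `FoldedHeaderDemo`, its method names, and the `LIMIT` constant are hypothetical and exist only for illustration; this is not how Mime4j itself implements the check.

```java
public class FoldedHeaderDemo {
    // Assumed limit, taken from the 1000-character figure in the description.
    static final int LIMIT = 1000;

    // Unfold a header: drop every CRLF that is directly followed by a space or tab.
    static String unfold(String raw) {
        return raw.replaceAll("\r\n(?=[ \t])", "");
    }

    // Build a folded header made of many short physical lines (CRLF + space).
    static String buildFoldedHeader(int lines) {
        StringBuilder sb = new StringBuilder("References:");
        for (int i = 0; i < lines; i++) {
            sb.append("\r\n <message-").append(i).append("@example.org>");
        }
        return sb.toString();
    }

    // Length of the longest physical line in the raw (still folded) header.
    static int longestLine(String raw) {
        int max = 0;
        for (String line : raw.split("\r\n")) {
            max = Math.max(max, line.length());
        }
        return max;
    }

    public static void main(String[] args) {
        String folded = buildFoldedHeader(100);
        System.out.println("longest physical line: " + longestLine(folded));
        System.out.println("unfolded header length: " + unfold(folded).length());
        System.out.println("exceeds limit: " + (unfold(folded).length() > LIMIT));
    }
}
```

      A header like this passes a per-line check but would fail a per-header check, which is the situation the issue title describes.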

      1. TIKA-640.patch
        3 kB
        Benjamin Douglas

          People

          • Assignee: Jukka Zitting
          • Reporter: Jens Wilmer

              Time Tracking

              • Original Estimate: 5m
              • Remaining Estimate: 5m
              • Time Spent: Not Specified