Tika / TIKA-3962

Set RFC822 parser to noRecurse

Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.7.0
    • Component/s: None
    • Labels: None

    Description

      In our test file testGroupWiseEml.eml, there is an embedded message/rfc822 attachment that is currently inlined into the parent message instead of being treated as an attachment.

      The relevant section of the test file is:

      Content-Type: message/rfc822
      Content-Transfer-Encoding: base64
      Content-Disposition: attachment; filename="test.eml"
      

      When I open the email in several email clients, each of them correctly shows test.eml as an attachment.

      It turns out that mime4j's parser has a setting, "setNoRecurse", that yields the correct behavior on this test file. Given that Tika already handles files recursively by default, I think it should be safe to set no-recurse on the mime4j parser and rely on Tika's own recursive parsing for the embedded message.
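
      To illustrate the difference, here is a minimal, standard-library-only Java sketch of what recursing versus not recursing means for a part like the one quoted above. It does not use mime4j or Tika; the class NoRecurseSketch and its methods are hypothetical names introduced only for this illustration, and the real handling in both libraries is more involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not mime4j's or Tika's code.
final class NoRecurseSketch {

    private static final Pattern FILENAME = Pattern.compile("filename=\"([^\"]*)\"");

    // True when the part declares itself an embedded email message.
    static boolean isEmbeddedMessage(Map<String, String> headers) {
        String type = headers.getOrDefault("Content-Type", "text/plain");
        return type.trim().toLowerCase(Locale.ROOT).startsWith("message/rfc822");
    }

    // Pulls the quoted filename out of Content-Disposition, or null if absent.
    static String fileName(Map<String, String> headers) {
        Matcher m = FILENAME.matcher(headers.getOrDefault("Content-Disposition", ""));
        return m.find() ? m.group(1) : null;
    }

    // Undoes the transfer encoding; only base64 is handled in this sketch.
    static byte[] decodeBody(Map<String, String> headers, String body) {
        String cte = headers.getOrDefault("Content-Transfer-Encoding", "7bit");
        if (cte.trim().equalsIgnoreCase("base64")) {
            return Base64.getMimeDecoder().decode(body);
        }
        return body.getBytes(StandardCharsets.UTF_8);
    }

    // Describes what a handler would do with the part in each mode.
    static String dispatch(Map<String, String> headers, String body, boolean noRecurse) {
        if (isEmbeddedMessage(headers) && noRecurse) {
            // No-recurse: hand the decoded bytes onward as one opaque attachment.
            byte[] raw = decodeBody(headers, body);
            return "attachment " + fileName(headers) + " (" + raw.length + " bytes)";
        }
        if (isEmbeddedMessage(headers)) {
            // Recurse: the embedded message is parsed as part of the parent.
            return "inlined into parent";
        }
        return "regular part";
    }

    public static void main(String[] args) {
        // Headers copied from the ticket; the body is made-up sample data.
        Map<String, String> headers = Map.of(
                "Content-Type", "message/rfc822",
                "Content-Transfer-Encoding", "base64",
                "Content-Disposition", "attachment; filename=\"test.eml\"");
        String body = Base64.getMimeEncoder().encodeToString(
                "Subject: hi\r\n\r\nhello".getBytes(StandardCharsets.UTF_8));
        System.out.println(dispatch(headers, body, true));
        System.out.println(dispatch(headers, body, false));
    }
}
```

      In the no-recurse branch the part surfaces as decoded bytes plus a filename, which is the shape a recursive container parser (such as Tika's, per the reasoning above) can then process as an embedded document in its own right.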

      Attachments

        1. TIKA-2680-1.eml.-2.6.0.json
          4 kB
          Tim Allison
        2. TIKA-2680-1.eml-2.7.0-prerc1.json
          12 kB
          Tim Allison

          People

            Assignee: Unassigned
            Reporter: Tim Allison (tallison)