Tika / TIKA-4153

Update RFC822 detection, again

Details

    • Type: Task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.1, 3.0.0-BETA
    • Component/s: None
    • Labels: None

    Description

      On the user list, Kashif Khan supplied the following example of a file that does not start with RFC822 header fields but is nevertheless detected as RFC822, leading to loss of text.

      Some text here 1.
      Some text here 2.
      Some text here 3.
      Original Message----
      From: some_mail@abc.com <some_mail@abc.com>
      Sent: Thursday, October 31, 2019 9:52 AM
      To: Some person, (The XYZ group)
      Subject: RE: Mr. Random person phone call: MESSAGE
      Hi,
      I am available now to receive the call.
      Some text here 4.
      Some text here 5.**Some text here 6.
      

      From what I can tell, we last modified the RFC822 detection in TIKA-4125; the major minShouldMatch refactoring was done in TIKA-3153.
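
The distinction at issue (header fields at the very start of a file versus header-like lines buried in the body) can be illustrated with a minimal, self-contained Java sketch. This is not Tika's detection logic and does not use any Tika API; the class name `Rfc822StartCheck`, the method `startsWithHeaderField`, and the regular expression are hypothetical, chosen only to show why the sample above should not be treated as RFC822.

```java
import java.util.regex.Pattern;

public class Rfc822StartCheck {
    // Assumption for illustration: a header field is a name made of printable
    // ASCII characters (no space, no colon) followed directly by a colon.
    private static final Pattern HEADER_FIELD = Pattern.compile("^[!-9;-~]+:.*");

    /** Returns true only if the first non-blank line looks like "Name: value". */
    public static boolean startsWithHeaderField(String text) {
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip leading blank lines
            }
            return HEADER_FIELD.matcher(line).matches();
        }
        return false; // empty input has no header field
    }

    public static void main(String[] args) {
        // Abbreviated version of the sample from the description:
        // header-like lines appear, but only after ordinary text.
        String quotedMail = String.join("\n",
                "Some text here 1.",
                "Some text here 2.",
                "Original Message----",
                "From: some_mail@abc.com <some_mail@abc.com>",
                "Subject: RE: Mr. Random person phone call: MESSAGE");
        // A file that does begin with header fields.
        String realMail = String.join("\n",
                "From: some_mail@abc.com",
                "Subject: hello",
                "",
                "Hi,");
        System.out.println(startsWithHeaderField(quotedMail));
        System.out.println(startsWithHeaderField(realMail));
    }
}
```

A check of this kind is deliberately strict about position: any text file whose first line happens to contain `word:` would still pass, so it is only a sketch of the "must start with a field" idea, not a replacement for magic-based detection.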

            People

              Assignee: Unassigned
              Reporter: Tim Allison (tallison)