Tika / TIKA-1971

Email saved as .eml with no body not detected as rfc822, while same email saved as plain txt is.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.14
    • Fix Version/s: 1.14, 2.0.0
    • Component/s: detector
    • Labels: None
    • Environment: Debian Jessie
      Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
      Mac OSX Mail

    Description

      I save an email with no body text in two ways:

      (1) by dragging it from Mac Mail so that an .eml file is created
      (2) by using 'Save As' in Mac Mail so that a .txt file is created

      I then feed each file to Tika Server with the following command:

      curl -T filename http://localhost:9998/detect/stream

      In case (1) the response is text/plain, while in case (2) it is message/rfc822. This is unexpected, since (1) contains the full raw headers, while (2) contains only a very abbreviated header.
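
The comparison above can be scripted so that both files go through the same endpoint in one run. The following is a minimal sketch, not part of the original report: the `detect` function name and the `TIKA_URL` variable are illustrative assumptions, and the default URL is simply the one used in the command above.

```shell
#!/bin/sh
# Endpoint taken from the command in the report; override via the
# TIKA_URL environment variable (an assumed name) if the server runs elsewhere.
TIKA_URL="${TIKA_URL:-http://localhost:9998/detect/stream}"

# Upload one file with PUT (curl -T) and print "<file>: <detected type>".
detect() {
    printf '%s: ' "$1"
    curl -s -T "$1" "$TIKA_URL"
    printf '\n'
}

# Run detection for every file named on the command line.
for f in "$@"; do
    detect "$f"
done
```

Saved under a hypothetical name such as detect.sh, it could be run as `sh detect.sh Testemail-empty-doesnotwork.eml Testemail-empty-works.txt`. On an affected version, the behaviour described above would show text/plain for the .eml file and message/rfc822 for the .txt file.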

      Attachments

        1. Testemail-empty-works.txt
          0.2 kB
          Philipp Steinkrueger
        2. Testemail-empty-doesnotwork.eml
          2 kB
          Philipp Steinkrueger


          People

            Assignee: Unassigned
            Reporter: Philipp Steinkrueger (philipp.steinkrueger@uni-koeln.de)
            Votes: 0
            Watchers: 4
