Tika / TIKA-396

Parser Attachements from Outlook Messages

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 1.0
    • Component/s: parser
    • Labels:
      None
    • Environment:

      All environments.

      Description

      As raised by Albert Jensen on the tika-user mailing list [1], it would be good for the Outlook Parser to iterate through a mail's attachments and extract their content.

      [1] http://mail-archives.apache.org/mod_mbox/lucene-tika-user/201003.mbox/%3C002701cacccf$16108b40$4231a1c0$@mail.dk%3E

          Activity

          Jukka Zitting added a comment -

          Looks like this one is already fixed.

          Jukka Zitting added a comment -

          In revision 933903 I modified the OutlookExtractor to use the parser instance in the ParseContext instead of a hardcoded AutoDetectParser when parsing the attachments. This is similar to what the PackageParser does, and allows better client-level control of the parsing process.

          Note that an extra "Invalid attachment id" line is now printed to System.out as part of the tika-parsers test suite. I guess this comes from POI.
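
          The delegation pattern described in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual OutlookExtractor code: the Parser interface, the ParseContext class, DEFAULT_PARSER, and parseAttachments are simplified, hypothetical stand-ins defined here so the example runs without Tika on the classpath. The fallback to a default parser when the context holds none is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AttachmentDelegationSketch {

    /** Hypothetical stand-in for a parser; not Tika's real Parser interface. */
    interface Parser {
        String parse(String attachment);
    }

    /** Hypothetical stand-in for a type-keyed context; not Tika's real ParseContext. */
    static final class ParseContext {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key);
            return value != null ? key.cast(value) : defaultValue;
        }
    }

    /** Assumed fallback, used only when the caller put no parser in the context. */
    static final Parser DEFAULT_PARSER = attachment -> "default:" + attachment;

    /** Parses each attachment with the parser found in the context, not a hardcoded one. */
    static List<String> parseAttachments(List<String> attachments, ParseContext context) {
        Parser delegate = context.get(Parser.class, DEFAULT_PARSER);
        List<String> results = new ArrayList<>();
        for (String attachment : attachments) {
            results.add(delegate.parse(attachment));
        }
        return results;
    }

    public static void main(String[] args) {
        ParseContext context = new ParseContext();
        // The client decides how attachments are handled by registering its own parser.
        context.set(Parser.class, attachment -> "custom:" + attachment);
        // prints [custom:report.doc]
        System.out.println(parseAttachments(Arrays.asList("report.doc"), context));
    }
}
```

          The point of looking the parser up in the context is that the caller, rather than the extractor, chooses what happens to each attachment, for example by registering a parser that skips or filters them.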

          Dave Meikle added a comment -

          Looking to add a test file but everything I have contains an attachment with private information. Does anyone have anything suitable available? Or do we just need to mock one up?

          Dave Meikle added a comment -

          Looks like basic English is escaping me this morning

          Dave Meikle added a comment -

          I have a working patch for this, but it requires TIKA-395 to be fixed so that the test files I am working with function correctly.


            People

            • Assignee:
              Dave Meikle
              Reporter:
              Dave Meikle
            • Votes:
              0
              Watchers:
              1
