Bug 45804 - [PATCH] hsmf -- MAPIMessage does not work with Outlook 3.0 .msg files
Status: RESOLVED FIXED
Alias: None
Product: POI
Classification: Unclassified
Component: POI Overall
Version: unspecified
Hardware: PC Windows XP
Importance: P2 blocker
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-14 14:54 UTC by Randall Scarberry
Modified: 2018-09-10 05:27 UTC

Attachments
Changes to the StringChunk class (878 bytes, patch)
2008-09-14 14:54 UTC, Randall Scarberry
An example Outlook 3.0 .msg file (95.00 KB, application/octet-stream)
2008-09-15 07:17 UTC, Randall Scarberry

Description Randall Scarberry 2008-09-14 14:54:09 UTC
Created attachment 22563 [details]
Changes to the StringChunk class

This applies to the hsmf component in the scratchpad area.

This might be the same bug as 45048.  Instantiating a MAPIMessage object on an Outlook 3.0 .msg file does not permit access to any of the items such as the subject or the content.  For example, calling getSubject() triggers a ChunkNotFoundException. 

The simple solution was to add a new constructor to StringChunk which lets you specify both the chunkId and the type.  Previously, the type defaulted to Types.STRING, which doesn't match any of the string items in my .msg files.

With the new constructor, I can retrieve both the message header and content as follows:

  MAPIMessage msg = new MAPIMessage("test.msg");

  String header = msg.getStringFromChunk(new StringChunk(0x007D, 0x001F));
  String content = msg.getStringFromChunk(new StringChunk(0x1000, 0x001F));
Comment 1 Nick Burch 2008-09-15 04:15:55 UTC
Thanks for this patch.

Any chance you could upload a sample file that triggers this problem? Then we can add a test, and also investigate whether we should tweak the main class too.
Comment 2 Randall Scarberry 2008-09-15 07:17:09 UTC
Created attachment 22571 [details]
An example Outlook 3.0 .msg file

This is an email file I extracted from Outlook 3.0 by copying it, then pasting it into a directory in Windows XP explorer.
Comment 3 Randall Scarberry 2008-09-15 07:51:04 UTC
For the attached .msg file, all of the string items in the message have labels of the form: "__substg1.0_0078001F".  The last 8 hex digits consist of the chunkId followed by the type.  For test.msg, the type is always 0x001F. The chunkId varies depending on whether it's the subject, text body, from, to, etc.
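
To make the label layout concrete, here is a minimal sketch that splits such a label into its two halves. The class SubstgName and its parse method are hypothetical names used only for illustration; they are not part of POI, and the sketch assumes the label is exactly the prefix plus 8 hex digits.

```java
/** Hypothetical helper (not a POI class): splits a "__substg1.0_" label into chunkId and type. */
final class SubstgName {
    static final String PREFIX = "__substg1.0_";

    final int chunkId; // first 4 hex digits after the prefix
    final int type;    // last 4 hex digits after the prefix

    private SubstgName(int chunkId, int type) {
        this.chunkId = chunkId;
        this.type = type;
    }

    static SubstgName parse(String entryName) {
        // Assumption: a valid label is the prefix followed by exactly 8 hex digits.
        if (!entryName.startsWith(PREFIX) || entryName.length() != PREFIX.length() + 8) {
            throw new IllegalArgumentException("Not a __substg1.0_ label: " + entryName);
        }
        String hex = entryName.substring(PREFIX.length());
        int chunkId = Integer.parseInt(hex.substring(0, 4), 16);
        int type = Integer.parseInt(hex.substring(4, 8), 16);
        return new SubstgName(chunkId, type);
    }

    public static void main(String[] args) {
        SubstgName n = parse("__substg1.0_0078001F");
        // prints "chunkId=0x0078 type=0x001F"
        System.out.printf("chunkId=0x%04X type=0x%04X%n", n.chunkId, n.type);
    }
}
```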

But the StringChunk class, without my changes, always uses 0x001E as the type, so the methods of MAPIMessage always throw a ChunkNotFoundException.  The message items are stored in a hash map in an object of POIFSChunkParser.  These items are supposed to be retrieved using the appropriate StringChunks as keys.  The new constructor for StringChunk lets me create appropriate StringChunks for my .msg files.
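
The lookup failure can be modelled with a toy map. This is only a sketch of the behaviour described above, not the actual POIFSChunkParser code: the class ChunkLookupDemo, the key format, and the stored text are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not POI code): chunks keyed by chunkId plus type, as in the labels. */
final class ChunkLookupDemo {
    // Builds a key in the same shape as the stream label, e.g. "__substg1.0_1000001F".
    static String key(int chunkId, int type) {
        return String.format("__substg1.0_%04X%04X", chunkId, type);
    }

    // A stand-in for a parsed message whose string items all use type 0x001F.
    static Map<String, String> sampleMessage() {
        Map<String, String> chunks = new HashMap<>();
        chunks.put(key(0x1000, 0x001F), "hypothetical body text");
        return chunks;
    }

    public static void main(String[] args) {
        Map<String, String> chunks = sampleMessage();
        // Key built with type 0x001E: no match, which is the ChunkNotFoundException case.
        System.out.println(chunks.get(key(0x1000, 0x001E)));
        // Key built with type 0x001F: the item is found.
        System.out.println(chunks.get(key(0x1000, 0x001F)));
    }
}
```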

I could've fixed the problem by redefining Types.STRING as 0x001F in org.apache.poi.hsmf.datatypes.Types.  But I figured that 0x001E probably works with whatever .msg files this was first developed for.

Maybe the lib could detect the version of the .msg file and define the StringChunk type appropriately as either 0x001E or 0x001F.
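
One way such detection might work is to scan the labels in the file and see which of the two string types actually occurs. The following is a rough sketch under that assumption, not the approach POI ended up using; StringTypeDetector and detectStringType are hypothetical names, and falling back to 0x001E when neither type is found is my own choice.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch (not POI code): picks the string type used by a message's labels. */
final class StringTypeDetector {
    static final String PREFIX = "__substg1.0_";
    static final int TYPE_001E = 0x001E; // the type StringChunk used before the patch
    static final int TYPE_001F = 0x001F; // the type seen in the attached test.msg

    static int detectStringType(Iterable<String> entryNames) {
        for (String name : entryNames) {
            // Skip anything that is not the prefix plus exactly 8 characters.
            if (!name.startsWith(PREFIX) || name.length() != PREFIX.length() + 8) {
                continue;
            }
            int type;
            try {
                type = Integer.parseInt(name.substring(PREFIX.length() + 4), 16);
            } catch (NumberFormatException e) {
                continue; // suffix was not hex; ignore this entry
            }
            if (type == TYPE_001E || type == TYPE_001F) {
                return type; // first string-typed label decides
            }
        }
        return TYPE_001E; // assumption: keep the old default when nothing matches
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "some_other_entry", "__substg1.0_007D001F", "__substg1.0_1000001F");
        // prints "0x001F"
        System.out.printf("0x%04X%n", detectStringType(names));
    }
}
```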

 
Comment 4 Nick Burch 2008-09-15 14:52:34 UTC
Thanks for the patch, test file and investigations.

In the end, I've got HSMF working with both the old and the new style Outlook files. Hopefully this'll work even better for you now! :)

In general though, HSMF isn't being actively developed right now (Travis isn't around ATM), so any further patches for HSMF are gratefully received!