Bug 44694

Summary: Unable to retrieve SummaryInformation from some Office docs; NoPropertySetStreamException thrown
Product: POI
Reporter: Dmitry Goldenberg <dgoldenberg>
Component: HPSF
Assignee: POI Developers List <dev>
Status: RESOLVED FIXED    
Severity: critical    
Priority: P2    
Version: 3.0-FINAL   
Target Milestone: ---   
Hardware: PC   
OS: Windows XP   

Description Dmitry Goldenberg 2008-03-27 11:26:33 UTC
I get this on some Office docs; a specific PPT doc that reproduces it is attached.

What I do is this:

POIFSFileSystem filesystem = new POIFSFileSystem(fis);
SummaryInformation si = (SummaryInformation) getPropertySet(filesystem, SummaryInformation.DEFAULT_STREAM_NAME, sourcePath);

where getPropertySet does this:

public static PropertySet getPropertySet(POIFSFileSystem filesystem, String setName, String filepath) throws IOException {
    DocumentInputStream dis = filesystem.createDocumentInputStream(setName);
    return PropertySetFactory.create(dis);
....

This causes the following exception:

org.apache.poi.hpsf.NoPropertySetStreamException
	at org.apache.poi.hpsf.PropertySet.<init>(PropertySet.java:252)
	at org.apache.poi.hpsf.PropertySetFactory.create(PropertySetFactory.java:61)

The file doesn't appear to be corrupt, as it opens in PowerPoint just fine. Also, I dumped out the directory nodes and can see that the summary info stream is in fact there.
Comment 1 Rainer Klute 2008-03-27 23:26:12 UTC
I cannot look into this issue without a document exhibiting the faulty behaviour.
Comment 2 Rainer Klute 2008-04-01 17:50:01 UTC
HPSF now supports property sets without any section. However, such a property set cannot be a summary information or a document summary information, because their identification is expected to be in the first section.
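
To illustrate what "without any section" means at the byte level, here is a minimal sketch that reads the section count from a property set stream header. It assumes the header layout from the [MS-OLEPS] specification (byte order marker, version, system identifier, CLSID, then a 32-bit section count at offset 24). The class and method names are hypothetical and are not part of HPSF, and the checks HPSF itself performs may differ.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of HPSF: inspects the fixed-size header
// of a property set stream, assuming the [MS-OLEPS] layout.
class PropertySetHeaderSketch {

    // 2 (byte order) + 2 (version) + 4 (system id) + 16 (CLSID) + 4 (section count)
    static final int HEADER_LENGTH = 28;
    static final int SECTION_COUNT_OFFSET = 24;

    /** Returns the section count, or -1 if the bytes do not look like a property set header. */
    static long sectionCount(byte[] stream) {
        if (stream == null || stream.length < HEADER_LENGTH) {
            return -1;
        }
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        int byteOrder = buf.getShort(0) & 0xFFFF; // stored as bytes FE FF
        int version = buf.getShort(2) & 0xFFFF;   // 0 or 1 per the specification
        if (byteOrder != 0xFFFE || version > 1) {
            return -1;
        }
        return buf.getInt(SECTION_COUNT_OFFSET) & 0xFFFFFFFFL;
    }

    /** A summary information is identified by its first section, so it needs at least one. */
    static boolean canBeSummaryInformation(byte[] stream) {
        return sectionCount(stream) >= 1;
    }

    public static void main(String[] args) {
        byte[] header = new byte[HEADER_LENGTH];
        header[0] = (byte) 0xFE;
        header[1] = (byte) 0xFF;
        System.out.println(sectionCount(header));            // prints 0
        System.out.println(canBeSummaryInformation(header)); // prints false
    }
}
```

A stream like the one built in main is a valid property set with zero sections, so a caller should not cast the resulting property set to SummaryInformation without checking first.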