Bug 43376 - Macro code lost after edit a ppt file
Summary: Macro code lost after edit a ppt file
Status: RESOLVED WONTFIX
Alias: None
Product: POI
Classification: Unclassified
Component: HSLF
Version: 3.0-FINAL
Hardware: PC Windows XP
Importance: P1 critical
Target Milestone: ---
Assignee: POI Developers List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-09-13 01:47 UTC by hearace
Modified: 2015-10-26 21:32 UTC



Attachments
this is the template file (18.00 KB, application/vnd.ms-powerpoint)
2007-09-13 18:19 UTC, hearace
here is the result file Generated by java (16.50 KB, application/vnd.ms-powerpoint)
2007-09-13 18:21 UTC, hearace

Description hearace 2007-09-13 01:47:16 UTC
I created a PPT file containing some macro code, then edited the file with a
Java application. The Java code is below:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

// POI 3.0-era package names
import org.apache.poi.hslf.HSLFSlideShow;
import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextBox;
import org.apache.poi.hslf.usermodel.SlideShow;

public class Test {
	public static void main(String[] args){
		System.out.println("Start");
		String data = "test abc";
		try {
			// Open the template that contains the macro
			FileInputStream fis = new FileInputStream("d:\\test\\temp.ppt");
			HSLFSlideShow hss = new HSLFSlideShow(fis);
			SlideShow ss = new SlideShow(hss);
			Slide slide = ss.getSlides()[0];
			TextBox shape = new TextBox();
			shape.setText(data);
			shape.setAnchor(new java.awt.Rectangle(50, 50, 500, 300));
//			slide.addShape(shape);

			// Write the result file
			FileOutputStream fos = new FileOutputStream("d:\\test\\result.ppt");
			hss.write(fos);
			//ss.write(fos);
			fos.close();
			fis.close();
		} catch (FileNotFoundException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}

		System.out.println("End");
	}
}

When I try to run the macro in the new PPT file "result.ppt", the error "File
not found" occurs.

My environment is as follows:
OS: Windows XP Simplified Chinese Edition
JDK: 1.4.2
POI: 3.0.1-FINAL-20070705
Office: the PPT files were created with Office XP and Office 2003 (I tried two
PPT files created with different Office editions)
Comment 1 Nick Burch 2007-09-13 02:34:25 UTC
It's possible the macros are stored in another stream, and we're not writing it
back out again

Could you upload a sample file, so we can take a look?
Comment 2 hearace 2007-09-13 18:19:41 UTC
Created attachment 20822
this is the template file
Comment 3 hearace 2007-09-13 18:21:59 UTC
Created attachment 20823
here is the result file Generated by java
Comment 4 hearace 2007-09-17 03:27:34 UTC
(In reply to comment #1)
> It's possible the macros are stored in another stream, and we're not writing it
> back out again
> Could you upload a sample file, so we can take a look?

Hi, Nick Burch

Could you please explain why this problem happens? Could you also tell me the
target date for fixing this defect? It is very important for my project.

Thank you!
Comment 5 Nick Burch 2007-09-17 09:33:22 UTC
On closer inspection, it appears that the macros aren't in a different stream.
So, it isn't a case of us losing OLE2 streams - all the OLE2 streams in the
source document are ending up in the final one. (That said, it is now possible
to ask HSLF to preserve all OLE2 nodes, much as you have been able to do with HSSF)

However, the PowerPoint stream is shrinking in size, so some data is getting
lost. Your best bet is to compare the two files using the
org.apache.poi.hslf.dev tools, and try to spot where your data is going missing.
My guess is that we're not re-serialising one of the key records properly.

If you can figure out exactly where the data loss is occurring, then hopefully we
can fix it
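
The comparison suggested above can also be approximated without the dev tools.
The following is a minimal sketch, assuming the "PowerPoint Document" stream has
already been extracted from each file as a byte array (for example via POIFS),
and assuming each record starts with the usual 8-byte header: 2 bytes of
version/instance, a 2-byte record type and a 4-byte length, all little-endian.
The class name RecordWalker and its methods are hypothetical and not part of
POI. It only lists top-level records, so running it on both files and diffing
the two lists should show which record shrank or went missing.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the top-level records of a raw PowerPoint stream. */
final class RecordWalker {

    /** Record type of ExOleObjStg (0x1011). */
    static final int EX_OLE_OBJ_STG = 4113;

    /** Reads an unsigned 16-bit little-endian value at the given offset. */
    static int u16(byte[] b, int off) {
        return (b[off] & 0xFF) | ((b[off + 1] & 0xFF) << 8);
    }

    /** Reads an unsigned 32-bit little-endian value at the given offset. */
    static long u32(byte[] b, int off) {
        return (b[off] & 0xFFL)
                | ((b[off + 1] & 0xFFL) << 8)
                | ((b[off + 2] & 0xFFL) << 16)
                | ((b[off + 3] & 0xFFL) << 24);
    }

    /** Returns one "offset type length" line per top-level record. */
    static List<String> summarize(byte[] stream) {
        List<String> lines = new ArrayList<>();
        int pos = 0;
        while (pos + 8 <= stream.length) {
            int type = u16(stream, pos + 2);   // bytes 2-3: record type
            long len = u32(stream, pos + 4);   // bytes 4-7: payload length
            lines.add(pos + " " + type + " " + len);
            long next = pos + 8 + len;
            if (next > stream.length) {
                break;                         // truncated or corrupt record
            }
            pos = (int) next;
        }
        return lines;
    }
}
```

Records whose type equals RecordWalker.EX_OLE_OBJ_STG are the ones to look at
first when a file has lost embedded object data.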
Comment 6 Yegor Kozlov 2007-09-17 11:54:21 UTC
How interesting!

I was sure it's a lost POIFS entry. In Word and Excel macros do live
in POIFS! It turns out PPT is a different case (:.
 
Here is what I've found:
In PPT macros are stored in ExOleObjStg root-level records. I changed the macro
code, saved the ppt and compared it with the original. 
The only difference was in ExOleObjStg. Thanks to Trejkaz, now we have accessors
to this data.

Here is some test code to dump ExOleObjStg to the filesystem:

        FileInputStream fis = new FileInputStream("temp.ppt");
        HSLFSlideShow hss = new HSLFSlideShow(fis);

        ObjectData[] obj = hss.getEmbeddedObjects();
        for (int i = 0; i < obj.length; i++) {
            FileOutputStream out = new FileOutputStream("objdata-" + (i+1) + ".dat");
            InputStream is = obj[i].getData();
            byte[] chunk = new byte[16184];
            int count;
            while ((count = is.read(chunk)) >=0 ) {
              out.write(chunk,0,count);
            }
            out.close();
        }

Look at it in a text editor and see that "it is about macros" :).
It looks like this data has a reference (offset) to the owning slide. If I don't
change anything, just re-save the ppt then the macros are there.
If I add a shape the macros are lost. So the task is to find this place
and update the link.
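
For anyone who wants to inspect the raw record rather than go through
getEmbeddedObjects(), here is a minimal sketch of unpacking an ExOleObjStg
payload by hand. It assumes the compressed form of the record, where the payload
is a 4-byte little-endian decompressed length followed by zlib-compressed data;
that layout is an assumption to check against the file being examined. The class
name ExOleObjStgPayload and its methods are hypothetical and not part of POI.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

/** Hypothetical helper: unpacks a compressed ExOleObjStg payload. */
final class ExOleObjStgPayload {

    /** Reads the decompressed length declared in the first 4 bytes (little-endian). */
    static int declaredLength(byte[] payload) {
        if (payload.length < 4) {
            throw new IllegalArgumentException("payload shorter than 4 bytes");
        }
        return (payload[0] & 0xFF)
                | ((payload[1] & 0xFF) << 8)
                | ((payload[2] & 0xFF) << 16)
                | ((payload[3] & 0xFF) << 24);
    }

    /** Inflates the zlib data that follows the 4-byte length prefix. */
    static byte[] inflate(byte[] payload) {
        declaredLength(payload);               // validates the minimum size
        Inflater inflater = new Inflater();
        inflater.setInput(payload, 4, payload.length - 4);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break;                     // input ended early; return what we have
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("payload is not valid zlib data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

Comparing the inflated bytes from the original and the re-saved file is one way
to check whether the macro storage itself changed or only the data around it.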

I don't know how to parse this abracadabra. Any ideas?
AFAIK, VBA macros are stored as "p-code + s-code", where p-code is the compiled
VBA code and s-code is the source. That is, ExOleObjStg contains both the
compiled and the source VBA code.

Useful links are 
http://www.virushelpmunic.de/konferenz/1999/makroviren/ (In German)
http://www.uinc.ru/articles/46/ (In Russian)

They are about Word and Excel and I hope these ideas are applicable to PPT.


Regards,
Yegor
Comment 7 hearace 2007-09-28 00:20:21 UTC
Can this problem be fixed?
Comment 8 Yegor Kozlov 2007-09-28 00:40:11 UTC
Unfortunately not. At least not in the foreseeable future.
We know where PowerPoint stores the macro code and what causes it to be lost
after an edit. To fix it we need to decode ExOleObjStg, and that is a big task.
If you want to volunteer, you are very welcome. I can explain the basic
principles of hacking the ppt format and help you.


Regards,
Yegor
Comment 9 Dominik Stadler 2015-10-26 21:32:30 UTC
This has been unsolved and without a response for a long time, therefore closing this as WONTFIX for now. If you have plans to work on this, please discuss on the dev mailing list first.