I created a PPT file containing some macro code, then edited the file with a Java application. The Java code is below:

    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;

    import org.apache.poi.hslf.HSLFSlideShow;
    import org.apache.poi.hslf.model.Slide;
    import org.apache.poi.hslf.model.TextBox;
    import org.apache.poi.hslf.usermodel.SlideShow;

    public class Test {
        public static void main(String[] args) {
            System.out.println("Start");
            String data = "test abc";
            try {
                // Open the template that contains the macro
                FileInputStream fis = new FileInputStream("d:\\test\\temp.ppt");
                HSLFSlideShow hss = new HSLFSlideShow(fis);
                SlideShow ss = new SlideShow(hss);
                Slide slide = ss.getSlides()[0];

                // Build a text box to place on the first slide
                TextBox shape = new TextBox();
                shape.setText(data);
                shape.setAnchor(new java.awt.Rectangle(50, 50, 500, 300));
                // slide.addShape(shape);

                // Write the result to a new file
                FileOutputStream fos = new FileOutputStream("d:\\test\\result.ppt");
                hss.write(fos);
                //ss.write(fos);
            } catch (FileNotFoundException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
            System.out.println("End");
        }
    }

When I try to run the macro in the new file "result.ppt", the error "File not found" occurs.

My environment is as follows:
OS: Windows XP, Simplified Chinese Edition
JDK: 1.4.2
POI: 3.0.1-FINAL-20070705
Office: the PPT files were created with Office XP and Office 2003 (I tried two PPT files created by different Office editions)
It's possible the macros are stored in another stream, and we're not writing it back out again.

Could you upload a sample file, so we can take a look?
Created attachment 20822: this is the template file
Created attachment 20823: here is the result file generated by Java
(In reply to comment #1)
> It's possible the macros are stored in another stream, and we're not writing it
> back out again
> Could you upload a sample file, so we can take a look?

Hi Nick Burch,

Could you please explain why this problem happens? Could you also tell me the target date for fixing this defect? It is very important for my project.

Thank you!
On closer inspection, it appears that the macros aren't in a different stream. So, it isn't a case of us losing OLE2 streams: all the OLE2 streams in the source document are ending up in the final one. (That said, it is now possible to ask HSLF to preserve all OLE2 nodes, much as you have been able to do with HSSF.)

However, the PowerPoint stream is shrinking in size, so some data is getting lost. Your best bet is to compare the two files using the org.apache.poi.hslf.dev tools, and try to spot where your data is going missing. My guess is that we're not re-serialising one of the key records properly.

If you can figure out exactly where the data loss is occurring, then hopefully we can fix it.
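As a rough illustration of what such a comparison looks for, here is a minimal sketch that is independent of the org.apache.poi.hslf.dev tools. It assumes the raw bytes of the "PowerPoint Document" stream have already been extracted into a byte array, and it relies on the PowerPoint binary record header layout (2 bytes of version/instance, 2 bytes of record type, 4 bytes of length, all little-endian). The class and method names are hypothetical and not part of POI.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the top-level records of a raw PowerPoint stream. */
final class PptRecordLister {

    /** Record type of ExOleObjStg in the PowerPoint binary format (0x1011). */
    static final int EX_OLE_OBJ_STG = 4113;

    /** One record header: where it starts, its type, and its payload length. */
    static final class RecordInfo {
        final int offset;
        final int type;
        final long length;

        RecordInfo(int offset, int type, long length) {
            this.offset = offset;
            this.type = type;
            this.length = length;
        }

        public String toString() {
            return "offset=" + offset + " type=" + type + " length=" + length;
        }
    }

    /** Walks the 8-byte record headers from the start of the stream. */
    static List<RecordInfo> listTopLevel(byte[] stream) {
        List<RecordInfo> out = new ArrayList<RecordInfo>();
        int pos = 0;
        while (pos + 8 <= stream.length) {
            // Bytes 2-3: record type, little-endian
            int type = (stream[pos + 2] & 0xFF) | ((stream[pos + 3] & 0xFF) << 8);
            // Bytes 4-7: payload length, little-endian unsigned 32-bit
            long length = (stream[pos + 4] & 0xFFL)
                    | ((stream[pos + 5] & 0xFFL) << 8)
                    | ((stream[pos + 6] & 0xFFL) << 16)
                    | ((stream[pos + 7] & 0xFFL) << 24);
            out.add(new RecordInfo(pos, type, length));
            long next = pos + 8L + length;
            if (next > stream.length) {
                break; // truncated or corrupt record: stop walking
            }
            pos = (int) next;
        }
        return out;
    }
}
```

Running this over the stream from the source file and from the result file, then comparing the two listings, should show which top-level record changed size or went missing.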
How interesting! I was sure it was a lost POIFS entry. In Word and Excel, macros do live in POIFS! It turns out PPT is a different case :).

Here is what I've found:

In PPT, macros are stored in ExOleObjStg root-level records. I changed the macro code, saved the ppt and compared it with the original. The only difference was in ExOleObjStg.

Thanks to Trejkaz, we now have accessors to this data. Here is some test code that dumps ExOleObjStg to the file system:

    // Needs java.io.*, org.apache.poi.hslf.HSLFSlideShow
    // and org.apache.poi.hslf.usermodel.ObjectData
    FileInputStream fis = new FileInputStream("temp.ppt");
    HSLFSlideShow hss = new HSLFSlideShow(fis);

    // Write each embedded object to its own file: objdata-1.dat, objdata-2.dat, ...
    ObjectData[] obj = hss.getEmbeddedObjects();
    for (int i = 0; i < obj.length; i++) {
        FileOutputStream out = new FileOutputStream("objdata-" + (i + 1) + ".dat");
        InputStream is = obj[i].getData();
        byte[] chunk = new byte[16184];
        int count;
        while ((count = is.read(chunk)) >= 0) {
            out.write(chunk, 0, count);
        }
        out.close();
    }

Look at it in a text editor and see that "it is about macros" :).

It looks like this data has a reference (offset) to the owning slide. If I don't change anything and just re-save the ppt, then the macros are there. If I add a shape, the macros are lost. So the task is to find this place and update the link. I don't know how to parse this abracadabra. Any ideas?

AFAIK, VBA macros are stored as "p-code + s-code", where p-code is the compiled VBA code and s-code is the source. That is, ExOleObjStg contains both compiled and source VBA code.

Useful links are:
http://www.virushelpmunic.de/konferenz/1999/makroviren/ (in German)
http://www.uinc.ru/articles/46/ (in Russian)

They are about Word and Excel, and I hope these ideas are applicable to PPT.

Regards,
Yegor
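One way to start hunting for that link might be to dump ExOleObjStg from two versions of a file and list the byte ranges that differ. The following is only a sketch of that idea: the class and method names are hypothetical, it uses nothing from POI, and it compares bytes position by position, so it is mainly useful when the two dumps have the same length or differ only near the end.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports where two dumped blobs differ. */
final class DumpDiff {

    /**
     * Returns {start, endExclusive} pairs, one per run of differing bytes.
     * If the inputs differ in length, the extra tail is reported as a final range.
     */
    static List<int[]> differingRanges(byte[] a, byte[] b) {
        List<int[]> ranges = new ArrayList<int[]>();
        int common = Math.min(a.length, b.length);
        int start = -1; // start of the current differing run, or -1 if none
        for (int i = 0; i < common; i++) {
            boolean differs = a[i] != b[i];
            if (differs && start < 0) {
                start = i;
            } else if (!differs && start >= 0) {
                ranges.add(new int[] {start, i});
                start = -1;
            }
        }
        if (start >= 0) {
            ranges.add(new int[] {start, common});
        }
        if (a.length != b.length) {
            ranges.add(new int[] {common, Math.max(a.length, b.length)});
        }
        return ranges;
    }

    /** Usage: java DumpDiff before.dat after.dat (file names are up to the caller). */
    public static void main(String[] args) throws IOException {
        byte[] before = Files.readAllBytes(Paths.get(args[0]));
        byte[] after = Files.readAllBytes(Paths.get(args[1]));
        List<int[]> ranges = differingRanges(before, after);
        for (int i = 0; i < ranges.size(); i++) {
            int[] r = ranges.get(i);
            System.out.println("bytes " + r[0] + ".." + (r[1] - 1) + " differ");
        }
    }
}
```

Small, isolated ranges that change when only the slide content changes would be candidates for the stored reference; this is a guess about where to look, not a description of the format.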
Can this problem be fixed?
Unfortunately not. At least not in the foreseeable future.

We know where PowerPoint stores macro code and what causes it to be lost after an edit. To fix it, we need to decode ExOleObjStg, and that is a big task.

If you want to volunteer, you are very welcome. I can explain the basic principles of hacking the ppt format and help you.

Regards,
Yegor
This has been unresolved with no response for a long time, so I am closing it as WONTFIX for now. If you plan to work on this, please discuss it on the dev mailing list first.