It is extremely important that the size of the decrypted data stream exactly matches the size specified in the input stream. If it does not, the Windows System.IO.Packaging.Package.Open() method returns an error when trying to open the decrypted document. In practice this means that MS Office will report that the file is corrupt.

Just reading to the end of the input stream is not sufficient, because there are normally padding bytes that must be discarded.

Currently Decryptor and/or its subclasses read and *discard* the required length. In order to be able to create valid documents from encrypted ones, this length must be made available so that the output stream can be truncated.

I will submit my proposed patch when I have some time to create it in the required format.
I'm not quite clear on what you're trying to do, where the problem comes in, and why you're talking about .net APIs? Could you maybe provide some more detail (or even better a unit test) that explains what goes wrong, where and why?
Sorry, Bugzilla and/or my browser lost my original description of the problem.

The issue is that the classic POI 'myDecrypt' method below in general produces output files that are considered corrupt by Microsoft Office.

If, say, you use MS Word to open a .docx file decrypted by this method, you will get an error dialog saying the file is corrupt and asking if you would like to attempt recovery (which incidentally will succeed).

The problem is that the output is too long. It must be truncated to the length specified in the input data stream, which is currently read and discarded, for example in "EcmaDecryptor>>getDataStream(DirectoryNode dir)".

This is the offending line:

128: long size = dis.readLong();

The solution is to save this length in an instance variable of the class Decryptor so that it is accessible to user-written code (Decryptor>>getLength()). The myDecrypt method can then be modified to work correctly.

=====
    private void myDecrypt(String filename, String password)
            throws FileNotFoundException, IOException {
        File inFile = new File(filename);
        File outFile = new File(new File(filename).getParentFile(),
                "Decrypted" + new File(filename).getName());

        System.err.println("Attempting to decrypt " + inFile.getAbsolutePath()
                + " to " + outFile.getAbsolutePath());

        POIFSFileSystem filesystem = new POIFSFileSystem(new FileInputStream(inFile));
        EncryptionInfo info = new EncryptionInfo(filesystem);
        Decryptor d = Decryptor.getInstance(info);

        try {
            if (!d.verifyPassword(password)) {
                throw new RuntimeException("Unable to process: wrong password");
            }

            InputStream dataStream = d.getDataStream(filesystem);

            // Copies everything up to end of stream, including the trailing padding.
            OutputStream out = new FileOutputStream(outFile);
            byte buf[] = new byte[1024];
            int len;
            while ((len = dataStream.read(buf)) > 0)
                out.write(buf, 0, len);
            out.close();
            dataStream.close();

        } catch (GeneralSecurityException ex) {
            throw new RuntimeException("Unable to process encrypted document", ex);
        }
        System.err.println("Finished " + inFile.getAbsolutePath());
    }
(In reply to comment #3)

Please attach a decrypted file so that we can test your code sample.

Yegor
As of r1293784, POI provides Decryptor#getLength(), which returns the length of the decrypted data stream.

The getLength() method must be called after Decryptor.getDataStream(), where the length variable is initialized. An attempt to call getLength() prior to getDataStream() will result in an IllegalStateException.

Regards,
Yegor
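For anyone adapting the myDecrypt loop from comment #3, the truncation can be done with a bounded copy along these lines. This is only a sketch using the standard library: the BoundedCopy class and copyExactly helper are hypothetical names, not part of POI, and the 32-byte array merely stands in for a padded decrypted stream. In real code the length argument would presumably be the value of d.getLength(), obtained after calling d.getDataStream(...).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, not part of POI.
class BoundedCopy {

    /**
     * Copies exactly 'length' bytes from 'in' to 'out' and leaves any
     * trailing bytes (for example cipher padding) unread.
     */
    public static void copyExactly(InputStream in, OutputStream out, long length)
            throws IOException {
        byte[] buf = new byte[1024];
        long remaining = length;
        while (remaining > 0) {
            // Never request more than is still owed to the output.
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) {
                throw new EOFException("Stream ended " + remaining + " bytes early");
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a decrypted stream: 20 payload bytes plus 12 padding bytes.
        byte[] padded = new byte[32];
        for (int i = 0; i < 20; i++) {
            padded[i] = (byte) (i + 1);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // With POI, the third argument would be the declared length
        // (assumed here to come from Decryptor#getLength()).
        copyExactly(new ByteArrayInputStream(padded), out, 20);
        System.out.println(out.size()); // prints 20
    }
}
```

In myDecrypt, the while loop would be replaced by a single call such as copyExactly(dataStream, out, d.getLength()), so the padding never reaches the output file.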