Description
[imported from SourceForge]
http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1323753
Originally submitted by loppermann on 2005-10-11 04:27.
When a PDF including an index-color image is encrypted,
the colors get mixed up. The pallette seems to remain
intact, but the colors of the pixels are wrong.
Test code: (bitmaptest.pdf is attached)
package pdfboxtest;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.pdmodel.encryption.*;
import java.io.*;
public class Main {
public static void main(String[] args) {
try
catch (Exception e)
{ e.printStackTrace(); } }
}
[attachment on SourceForge]
http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1323753&file_id=152105
bitmaptest.pdf (application/pdf), 14049 bytes
pdf containg an indexed color image
[attachment on SourceForge]
http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1323753&file_id=152106
result.pdf (application/pdf), 13927 bytes
result.pdf file with messed up colors
[attachment on SourceForge]
http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1323753&file_id=152108
Main.java (application/octet-stream), 743 bytes
test case source
[comment on SourceForge]
Originally sent by loppermann.
Logged In: YES
user_id=1359988
I haven't found anything about exceptions for strings in the
spec. On the otherhand, acrobar reader as well as pdfbox
clearly seem not to decrypt the palette string.
I am not an expert at cryptography, but I imagine, that this
could have the following reason:
Since palettes are often known (windows standard palette,
web-palette etc.) encrypting them, might give an attacker a
reasonable guess at what the cleartext of the encrypted
string should look like. Hence, if he was able to use a
clear-text attack against the palette string he could use
the extracted key for all other streams in the file.
Since the a palette does not include very sensitive
information by itself, it might be a more secure choice to
not encrypt it at all, in order to protect from such
clear-text attacks.
[comment on SourceForge]
Originally sent by benlitchfield.
Logged In: YES
user_id=601708
strings should be encrypted, but there are some exceptions.
thanks for looking further into this for me. I will check with
the PDF reference to see if it says anything. the exceptions
are usually security related so I would be surprised if it was
not suppose to encrypted, there it might be something related.
Ben
[comment on SourceForge]
Originally sent by loppermann.
Logged In: YES
user_id=1359988
ok, if all strings are not encrypted, Docuemnt information
will get messed up, i.e. it will be encrypted during
decryption
[comment on SourceForge]
Originally sent by loppermann.
Logged In: YES
user_id=1359988
my observation of the palatte staying intact seems to be
wrong. after some more testing and debugging, I see, that
the palette seems to get encrypted, which I think it should
not be. Due to the symmetric nature of RC4, encrypting,
decrypting and encrypting again will yield the right
palette. So even PDFBox does not try to decypt the palette
string it has encrypted when opening and decrypting the PDF.
When I disable the encryption of Strings in the PDFWriter,
everything seems to work fine. I haven't found this in the
spec yet, with which I'm not so familiar. I will keep
looking though.
So if in fact strings should not be encrypted (and the fact
that they are not decrypted seems to imply this), the fix is
simply to take out the encryption of strings in PDFWriter.
[comment on SourceForge]
Originally sent by loppermann.
Logged In: YES
user_id=1359988
attached test case
[comment on SourceForge]
Originally sent by loppermann.
Logged In: YES
user_id=1359988
attached result.pdf file with messed up colors