Uploaded image for project: 'PDFBox'
  1. PDFBox
  2. PDFBOX-99

Indexed color images have wrong colors after encryption

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 1.3.1
    • PDModel
    • None

    Description

      [imported from SourceForge]
      http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1323753
      Originally submitted by loppermann on 2005-10-11 04:27.

      When a PDF including an index-color image is encrypted,
      the colors get mixed up. The pallette seems to remain
      intact, but the colors of the pixels are wrong.

      Test code: (bitmaptest.pdf is attached)

      package pdfboxtest;
      import org.pdfbox.pdmodel.PDDocument;
      import org.pdfbox.pdmodel.encryption.*;
      import java.io.*;

      public class Main {
      public static void main(String[] args) {
      try

      { PDStandardEncryption enc = new PDStandardEncryption(); enc.setVersion(PDEncryptionDictionary.VERSION2_VARIABLE_LENGTH_ALGORITHM); enc.setRevision(PDStandardEncryption.REVISION3); enc.setLength(128); PDDocument doc = PDDocument.load("bitmaptest.pdf"); doc.setEncryptionDictionary(enc); doc.encrypt("owner", ""); doc.save("result.pdf"); doc.close(); }

      catch (Exception e)

      { e.printStackTrace(); }

      }
      }

      [attachment on SourceForge]
      http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1323753&file_id=152105
      bitmaptest.pdf (application/pdf), 14049 bytes
      pdf containg an indexed color image

      [attachment on SourceForge]
      http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1323753&file_id=152106
      result.pdf (application/pdf), 13927 bytes
      result.pdf file with messed up colors

      [attachment on SourceForge]
      http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1323753&file_id=152108
      Main.java (application/octet-stream), 743 bytes
      test case source

      [comment on SourceForge]
      Originally sent by loppermann.
      Logged In: YES
      user_id=1359988

      I haven't found anything about exceptions for strings in the
      spec. On the otherhand, acrobar reader as well as pdfbox
      clearly seem not to decrypt the palette string.
      I am not an expert at cryptography, but I imagine, that this
      could have the following reason:
      Since palettes are often known (windows standard palette,
      web-palette etc.) encrypting them, might give an attacker a
      reasonable guess at what the cleartext of the encrypted
      string should look like. Hence, if he was able to use a
      clear-text attack against the palette string he could use
      the extracted key for all other streams in the file.
      Since the a palette does not include very sensitive
      information by itself, it might be a more secure choice to
      not encrypt it at all, in order to protect from such
      clear-text attacks.

      [comment on SourceForge]
      Originally sent by benlitchfield.
      Logged In: YES
      user_id=601708

      strings should be encrypted, but there are some exceptions.
      thanks for looking further into this for me. I will check with
      the PDF reference to see if it says anything. the exceptions
      are usually security related so I would be surprised if it was
      not suppose to encrypted, there it might be something related.

      Ben

      [comment on SourceForge]
      Originally sent by loppermann.
      Logged In: YES
      user_id=1359988

      ok, if all strings are not encrypted, Docuemnt information
      will get messed up, i.e. it will be encrypted during
      decryption

      [comment on SourceForge]
      Originally sent by loppermann.
      Logged In: YES
      user_id=1359988

      my observation of the palatte staying intact seems to be
      wrong. after some more testing and debugging, I see, that
      the palette seems to get encrypted, which I think it should
      not be. Due to the symmetric nature of RC4, encrypting,
      decrypting and encrypting again will yield the right
      palette. So even PDFBox does not try to decypt the palette
      string it has encrypted when opening and decrypting the PDF.
      When I disable the encryption of Strings in the PDFWriter,
      everything seems to work fine. I haven't found this in the
      spec yet, with which I'm not so familiar. I will keep
      looking though.

      So if in fact strings should not be encrypted (and the fact
      that they are not decrypted seems to imply this), the fix is
      simply to take out the encryption of strings in PDFWriter.

      [comment on SourceForge]
      Originally sent by loppermann.
      Logged In: YES
      user_id=1359988

      attached test case

      [comment on SourceForge]
      Originally sent by loppermann.
      Logged In: YES
      user_id=1359988

      attached result.pdf file with messed up colors

      Attachments

        1. bitmaptest.pdf
          14 kB
          Adam Nichols
        2. PDFBOX-99.patch
          2 kB
          Adam Nichols
        3. result.pdf
          14 kB
          Adam Nichols

        Activity

          People

            adamnichols Adam Nichols
            Anonymous Anonymous
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: