PDFBox / PDFBOX-2128

CMYK images are not supported correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.5, 1.8.6, 2.0.0
    • Fix Version/s: 2.0.0
    • Component/s: PDModel
    • Labels:
    • Environment:
      Windows 7 Professional
      Running jvm: Java HotSpot(TM) 64-Bit Server VM - 1.6.0_26-b03 - 20.1-b02 - Sun Microsystems Inc

      Description

      I have a PDF that contains CMYK images, and I need to extract those images in RGB format. However, the PDJpeg class does not seem to handle them correctly: the colors of the extracted images are wrong. Example:

      You can download the PDF: http://ludoda.free.fr/PORSCHE_CMYK.PDF

      and try my simple test case (I'm using PDFBox 1.8.5):

      import java.awt.image.BufferedImage;
      import java.io.File;
      import java.io.IOException;
      import java.util.List;
      import java.util.Map;
      
      import javax.imageio.ImageIO;
      
      import org.apache.pdfbox.pdmodel.PDDocument;
      import org.apache.pdfbox.pdmodel.PDPage;
      import org.apache.pdfbox.pdmodel.PDResources;
      import org.apache.pdfbox.pdmodel.graphics.xobject.PDJpeg;
      import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObject;
      
      public class TestCase {
      
      	public static void main(String[] args)
      	{
      		try
      		{
      			System.out.println("START EXTRACTING IMAGES...");
      			read_pdf();
      			System.out.println("COMPLETE");
      		}
      		catch (IOException ex)
      		{
      			System.out.println("" + ex);
      		}
      	}
      
      	public static void read_pdf() throws IOException
      	{
      		PDDocument document = PDDocument.load("C:\\temp\\PORSCHE_CMYK.pdf");
      		try
      		{
      			@SuppressWarnings("unchecked")
      			List<PDPage> pages = document.getDocumentCatalog().getAllPages();
      			int i = 1;
      
      			for (PDPage page : pages)
      			{
      				PDResources resources = page.getResources();
      				Map<String, PDXObject> pageImages = (resources == null) ? null : resources.getXObjects();
      				if (pageImages == null)
      				{
      					continue;
      				}
      				for (PDXObject xobject : pageImages.values())
      				{
      					// Only JPEG (DCT) images are PDJpeg; skip forms and other image types
      					if (xobject instanceof PDJpeg)
      					{
      						PDJpeg image = (PDJpeg) xobject;
      
      						// Test 1: write2file (the file suffix is appended by PDFBox)
      						image.write2file("C:\\workspace\\JAVA_PDFTools\\temp\\image" + i);
      
      						// Test 2: getRGBImage
      						BufferedImage bimage = image.getRGBImage();
      						File outputfile = new File("C:\\workspace\\JAVA_PDFTools\\temp\\image" + i + "_buffered.jpg");
      						ImageIO.write(bimage, "jpg", outputfile);
      						i++;
      					}
      				}
      			}
      		}
      		finally
      		{
      			// Always release the document, even if extraction fails
      			document.close();
      		}
      	}
      }
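
To judge whether the extracted colors are plausible, it can help to compare a few pixels against a naive CMYK to RGB conversion. The sketch below is not PDFBox code and not the fix applied for this issue; the class name CmykToRgbSketch and both methods are hypothetical. It assumes 8-bit components where 255 means full ink, and it ignores ICC profiles, so the results are only a rough sanity check. The second method covers the assumption (not confirmed in this report) that the encoder stored inverted samples, which some CMYK JPEG writers are known to do.

```java
public class CmykToRgbSketch {

    /**
     * Naive conversion of one CMYK pixel to packed 0xRRGGBB.
     * Assumption: each component is 0..255 and 255 means full ink.
     * No ICC profile is applied, so colors are only approximate.
     */
    static int cmykToRgb(int c, int m, int y, int k) {
        int r = (255 - c) * (255 - k) / 255;
        int g = (255 - m) * (255 - k) / 255;
        int b = (255 - y) * (255 - k) / 255;
        return (r << 16) | (g << 8) | b;
    }

    /**
     * Same conversion for inverted samples (0 means full ink).
     * Hypothetical helper for files whose encoder stored inverted CMYK.
     */
    static int invertedCmykToRgb(int c, int m, int y, int k) {
        return cmykToRgb(255 - c, 255 - m, 255 - y, 255 - k);
    }

    public static void main(String[] args) {
        // No ink at all is white
        System.out.println(String.format("%06X", cmykToRgb(0, 0, 0, 0)));     // FFFFFF
        // Full cyan only
        System.out.println(String.format("%06X", cmykToRgb(255, 0, 0, 0)));   // 00FFFF
        // Full black ink
        System.out.println(String.format("%06X", cmykToRgb(0, 0, 0, 255)));   // 000000
    }
}
```

If the naive result for a pixel looks roughly right but the extracted image looks like a negative or has a strong color cast, trying invertedCmykToRgb on the same samples is a quick way to test the inverted-samples assumption.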
      

        Attachments

        1. 573636.pdf (1.72 MB, Tilman Hausherr)
        2. DCTFilter.patch (3 kB, Tilman Hausherr)
        3. PDFBOX-2128-CAPITAL.pdf-1.png (238 kB, Tilman Hausherr)
        4. PDFBOX-2128-CAPITAL.pdf-1.png (241 kB, Tilman Hausherr)
        5. PDFBOX-2128-CAPITAL.pdf-2.png (270 kB, Tilman Hausherr)
        6. PDFBOX-2128-PORSCHE_CMYK.pdf-1.png (32 kB, Tilman Hausherr)
        7. PDFBOX-2128-PORSCHE_CMYK.pdf-2.png (1.07 MB, Tilman Hausherr)
        8. porsche_cmyk.pdf-2.png (924 kB, Tilman Hausherr)

              People

              • Assignee: Unassigned
              • Reporter: Ludovic Davoine (ldavoine)
              • Votes: 2
              • Watchers: 7

                  Time Tracking

                  • Estimated: 1h
                  • Remaining: 1h
                  • Logged: Not Specified