PDFBox / PDFBOX-5753

multipdf.Splitter - Changes color of images in split pages

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.0.1 PDFBox
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: macOS Sonoma 14.2.1

    Description

      When a document is split with the default org.apache.pdfbox.multipdf.Splitter, the colors of images on the resulting pages change.

      The attachments contain an example: the yellow tone on page 10 of the original document differs from the yellow tone on the same page after extraction. Compare original document.pdf with page 10.pdf.

      The Java code used to extract the pages:

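      // Requires: java.io.File, java.util.List, org.apache.pdfbox.Loader,
      // org.apache.pdfbox.pdmodel.PDDocument, org.apache.pdfbox.multipdf.Splitter
      try (final PDDocument document = Loader.loadPDF(new File(inputPath))) {
          Splitter splitter = new Splitter();
          // One PDDocument per source page (default split behavior)
          List<PDDocument> pages = splitter.split(document);
          for (int i = 0; i < pages.size(); i++) {
              final String fileName = OUT_PREFIX + (i + 1) + OUT_SUFFIX;
              final File file = new File(outputPath, fileName);
              // Save while the source document is still open, then close the split document
              try (PDDocument page = pages.get(i)) {
                  page.save(file);
              }
          }
      }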
      

      The dependency in the pom.xml:

      <dependency>
          <groupId>org.apache.pdfbox</groupId>
          <artifactId>pdfbox</artifactId>
          <version>3.0.1</version>
      </dependency>
      

      Also I am using Java 21.
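
      To put a number on the color difference described above, both pages could be rendered to images and compared pixel by pixel. The following is only a minimal sketch using plain JDK classes. PixelDiff and its methods are hypothetical names, not part of PDFBox. It assumes the two images were rendered beforehand at the same DPI, for example with PDFRenderer.renderImageWithDPI (page index 9 of original document.pdf, page index 0 of page 10.pdf).

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper (not part of PDFBox): compares two rendered pages pixel by pixel. */
final class PixelDiff {

    private PixelDiff() {
    }

    /**
     * Counts the pixels whose ARGB value differs between two images.
     * Assumes both images have the same dimensions; throws otherwise.
     */
    static long countDifferingPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("Image dimensions differ");
        }
        long differing = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    differing++;
                }
            }
        }
        return differing;
    }

    /** Formats the RGB part of one pixel as a lowercase hex string. */
    static String rgbHex(BufferedImage image, int x, int y) {
        // Mask out the alpha channel so only red, green and blue remain
        return String.format("#%06x", image.getRGB(x, y) & 0xFFFFFF);
    }
}
```

      A non-zero count from countDifferingPixels would confirm that the renderings differ, and rgbHex on a pixel inside the yellow area of each image would show the two tones side by side. Small differences can also come from rendering itself, so this is a rough check rather than a diagnosis.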

      Attachments

        1. original document.pdf
          3.57 MB
          Marcus Korinth
        2. page 10.pdf
          160 kB
          Marcus Korinth
        3. split-with-snapshot-p10.pdf
          160 kB
          Tilman Hausherr

            People

              Assignee: Unassigned
              Reporter: Marcus Korinth (mko91)
              Votes: 0
              Watchers: 2
