  1. PDFBox
  2. PDFBOX-4051

Different DestOutputProfiles in OutputIntentArray after PDFMergerUtility.Merge leads to non-conformity


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Bug
    • Affects Version/s: 2.0.6
    • Fix Version/s: None
    • Component/s: Utilities
    • Labels: None

    Description

      Hi, I'm not sure whether this is a bug or not, so I'll just ask:
      I'm merging some PDFs that conform to the PDF/A-1b standard with PDFMergerUtility.mergeDocuments.
      The result has to conform to PDF/A-1b as well.
      My code looks like this:

      // createPDFMergerUtility, createPDFDocumentInfo and createXMPMetadata are helper methods (not shown)
      public ByteArrayOutputStream merge(final List<InputStream> sources) throws IOException {
          try (
              ByteArrayOutputStream mergedPDFOutputStream = new ByteArrayOutputStream();
              COSStream cosStream = new COSStream()
          ) {
              PDFMergerUtility pdfMerger = createPDFMergerUtility(sources, mergedPDFOutputStream);
              PDDocumentInformation pdfDocumentInfo = createPDFDocumentInfo(TITLE, CREATOR, SUBJECT);
              PDMetadata xmpMetadata = createXMPMetadata(cosStream, TITLE, CREATOR, SUBJECT);
              // Only document information and XMP metadata can be set for the destination
              pdfMerger.setDestinationDocumentInformation(pdfDocumentInfo);
              pdfMerger.setDestinationMetadata(xmpMetadata);
              pdfMerger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
              return mergedPDFOutputStream;
          } catch (BadFieldValueException | TransformerException e) {
              throw new IOException("PDF merge problem", e);
          } finally {
              for (InputStream source : sources) {
                  try {
                      source.close();
                  } catch (IOException e) {
                      // ignore failures while closing an input stream
                  }
              }
          }
      }
      

      This works fine if the PDFs come from the same source, i.e. have similar OutputIntents in their catalogs.
      But when I mix documents that have different OutputIntents, such as

      /OutputIntents
      [
      <<
      /Type /OutputIntent
      /S /GTS_PDFA1
      /OutputConditionIdentifier (sRGB)
      /RegistryName (http://www.color.org)
      /DestOutputProfile 36 0 R
      >>
      ]
      

      and

      <</OutputIntents[<</Info(Adobe RGB \(1998\))/S/GTS_PDFA1/Type/OutputIntent/DestOutputProfile 1 0 R/OutputConditionIdentifier(Adobe RGB \(1998\))>>]/Metadata 16 0 R/Type/Catalog/StructTreeRoot 15 0 R/MarkInfo<</Marked true>>/Pages 4 0 R>>
      

      PDFMergerUtility seems to concatenate them, keeping the different DestOutputProfiles:

      4 0 obj
      <<
      /Type /OutputIntent
      /S /GTS_PDFA1
      /OutputConditionIdentifier (sRGB)
      /RegistryName (http://www.color.org)
      /DestOutputProfile 17 0 R
      >>
      endobj
      5 0 obj
      <<
      /Info (Adobe RGB \(1998\))
      /S /GTS_PDFA1
      /Type /OutputIntent
      /DestOutputProfile 18 0 R
      /OutputConditionIdentifier (Adobe RGB \(1998\))
      >>
      endobj
      

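      and therefore the result no longer conforms: it fails ISO 19005-1:2005, clause 6.2.2. That clause requires that, if the OutputIntents array has more than one entry, every entry with a DestOutputProfile key references the same indirect object.

      When I manually change the file (in a text editor) from "/DestOutputProfile 18 0 R" to "/DestOutputProfile 17 0 R", the file conforms again.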
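
      To make the clause 6.2.2 condition and the manual edit concrete, here is a minimal sketch in plain Java. It does not use PDFBox: the class OutputIntentCheck, the nested Intent type and both method names are hypothetical and only model the OutputIntents array as a list of (identifier, profile object number) pairs. Repointing every entry to the first profile mirrors the text-editor change above; whether a single ICC profile is actually appropriate for all merged content is a separate question that this sketch does not answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not PDFBox API: it only illustrates the rule.
final class OutputIntentCheck {

    /** Stand-in for one entry of the catalog's /OutputIntents array. */
    static final class Intent {
        final String identifier;     // value of /OutputConditionIdentifier
        final Integer profileObject; // object number of /DestOutputProfile, null if absent

        Intent(String identifier, Integer profileObject) {
            this.identifier = identifier;
            this.profileObject = profileObject;
        }
    }

    private OutputIntentCheck() {}

    /** Returns the first profile reference found, or null if no entry has one. */
    static Integer firstProfile(List<Intent> intents) {
        for (Intent intent : intents) {
            if (intent.profileObject != null) {
                return intent.profileObject;
            }
        }
        return null;
    }

    /** True if every entry that has a profile references the same object. */
    static boolean sharesOneProfile(List<Intent> intents) {
        Integer first = firstProfile(intents);
        for (Intent intent : intents) {
            if (intent.profileObject != null && !intent.profileObject.equals(first)) {
                return false;
            }
        }
        return true;
    }

    /** Returns a copy in which every existing profile reference points to the first one. */
    static List<Intent> repointToFirstProfile(List<Intent> intents) {
        Integer first = firstProfile(intents);
        List<Intent> result = new ArrayList<>();
        for (Intent intent : intents) {
            Integer target = (intent.profileObject == null) ? null : first;
            result.add(new Intent(intent.identifier, target));
        }
        return result;
    }

    public static void main(String[] args) {
        // The two entries from the merged file: objects 17 and 18
        List<Intent> merged = Arrays.asList(
                new Intent("sRGB", 17),
                new Intent("Adobe RGB (1998)", 18));
        System.out.println(sharesOneProfile(merged));                        // false
        System.out.println(sharesOneProfile(repointToFirstProfile(merged))); // true
    }
}
```

      In a real document the same check would have to compare the indirect object behind each DestOutputProfile entry, not a plain number.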

      I might be able to re-parse the merged document as a PDDocument and modify the OutputIntents array in the catalog, but I don't think that's how it was intended?

      So am I doing something wrong, or should PDFMergerUtility get not only setters for the document information and metadata, but also some manual way to influence the OutputIntents?

      Thanks in advance.

            People

              Assignee: Unassigned
              Reporter: Joerg Neumann (DrZoidberg)
              Votes: 0
              Watchers: 2
