Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: Utilities
    • Labels:
    • Environment:
      Windows 2008 R2

      Description

      Merging two PDFs with form fields results in a PDF with empty fields.
      The issue seems similar to https://issues.apache.org/jira/browse/PDFBOX-1031, but in my case I can see the fields, just not the values in them.

      I use the following command to merge the PDFs:
      java -classpath pdfbox-app-1.6.0.jar org.apache.pdfbox.PDFMerger a.pdf b.pdf c.pdf

      1. dereferenceObjectStreams.patch
        0.9 kB
        Ernst Eibensteiner
      2. c.pdf
        47 kB
        Gerhard Temper
      3. b.pdf
        26 kB
        Gerhard Temper
      4. a.pdf
        31 kB
        Gerhard Temper

        Activity

        Ernst Eibensteiner added a comment -

        I created a short test program for that case:

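        import java.util.Iterator;
        import java.util.List;
        import org.apache.pdfbox.pdmodel.PDDocument;
        import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
        import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
        import org.apache.pdfbox.pdmodel.interactive.form.PDField;

        // Load one of the source documents and print every top-level form field.
        PDDocument pdf = PDDocument.load("C:\\tmp\\a.pdf");
        PDDocumentCatalog docCatalog = pdf.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        List<PDField> list = acroForm.getFields();
        Iterator<PDField> it = list.iterator();
        while (it.hasNext()) {
            PDField field = it.next();
            System.out.println("FQ Name: " + field.getFullyQualifiedName());
            System.out.println("Value: " + field.getValue());
        }
        pdf.close();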

        Output shows:
        =============
        FQ Name: Testfeld
        Value: null
        FQ Name: Testfeld2
        Value: null

        So it seems that field.getValue() either does not work correctly or does not return the latest value stored for the form field.

        Ernst Eibensteiner added a comment -

        Apparently the COSDictionary of each field does not contain the correct value:

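        import java.util.Iterator;
        import java.util.List;
        import org.apache.pdfbox.pdmodel.PDDocument;
        import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
        import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
        import org.apache.pdfbox.pdmodel.interactive.form.PDField;

        // Dump the raw COS objects behind the AcroForm and each of its fields.
        PDDocument pdf = PDDocument.load("C:\\tmp\\a.pdf");
        PDDocumentCatalog docCatalog = pdf.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        System.out.println("acroForm COSObject: " + acroForm.getCOSObject().toString());
        List<PDField> list = acroForm.getFields();
        Iterator<PDField> it = list.iterator();
        while (it.hasNext()) {
            PDField field = it.next();
            System.out.println("fieldCOSObject: " + field.getCOSObject().toString());
        }
        pdf.close();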

        Output shows:
        acroForm COSObject: ..... COSName{Fields}:COSArray{[COSObject{31, 0}, COSObject{32, 0}]})

        The correct COSObjects 31 and 32 are found for the Fields, but:
        COSObject 31 does not contain COSName{V}
        COSObject 32 does not contain COSName{V}

        If I open the same document using http://www.pdftron.com/ I can see the correct COSName{V} for both COSObjects (object numbers 31 and 32), containing the correct value.

        Ernst Eibensteiner added a comment -

        I've created a short patch that worked for me, but there is certainly a reason for the if clause!

        I found out that COSObject 31 and 32 were each added twice to the COSDictionary. The first version did not contain a {V} attribute, the second one did. So the patch adds both versions to the COSDictionary.

        Hopefully this helps.
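
The observation above can be illustrated with a small toy model. This is only a sketch under assumptions: it assumes the loader skips an object number it has already stored, which is one possible reading of the "if clause" and is not confirmed against the PDFBox source. The class ObjectPoolSketch, its methods, and the field values "value A" and "value B" are hypothetical; only the object numbers 31 and 32 and the field names come from the report.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types are PDFBox classes.
class ObjectPoolSketch {

    /** One parsed definition of an indirect object: its number and its dictionary entries. */
    static final class ParsedObject {
        final int number;
        final Map<String, String> entries;

        ParsedObject(int number, Map<String, String> entries) {
            this.number = number;
            this.entries = entries;
        }
    }

    /** Keeps the first definition seen per object number; later definitions are skipped. */
    static Map<Integer, Map<String, String>> firstWins(List<ParsedObject> parsed) {
        Map<Integer, Map<String, String>> pool = new LinkedHashMap<>();
        for (ParsedObject o : parsed) {
            if (!pool.containsKey(o.number)) { // hypothetical stand-in for the guarding if clause
                pool.put(o.number, o.entries);
            }
        }
        return pool;
    }

    /** Lets a later definition of the same object number replace the earlier one. */
    static Map<Integer, Map<String, String>> lastWins(List<ParsedObject> parsed) {
        Map<Integer, Map<String, String>> pool = new LinkedHashMap<>();
        for (ParsedObject o : parsed) {
            pool.put(o.number, o.entries);
        }
        return pool;
    }

    /** Objects 31 and 32 appear twice: first without V, then with V (values are made up). */
    static List<ParsedObject> sample() {
        return List.of(
                new ParsedObject(31, Map.of("T", "Testfeld")),
                new ParsedObject(32, Map.of("T", "Testfeld2")),
                new ParsedObject(31, Map.of("T", "Testfeld", "V", "value A")),
                new ParsedObject(32, Map.of("T", "Testfeld2", "V", "value B")));
    }

    public static void main(String[] args) {
        System.out.println("first wins, 31 has V: " + firstWins(sample()).get(31).containsKey("V"));
        System.out.println("last wins, 31 has V: " + lastWins(sample()).get(31).containsKey("V"));
    }
}
```

In this model the "first wins" pool ends up with field dictionaries that lack V, which matches the null values printed by the test program, while the "last wins" pool keeps the version carrying V. Whether replacing earlier definitions is safe in general is exactly the open question about the if clause.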


          People

          • Assignee:
            Unassigned
            Reporter:
            Gerhard Temper
          • Votes:
            3
            Watchers:
            3

            Dates

            • Created:
              Updated:

              Development