PDFBox / PDFBOX-4185

Fetching options for PDChoice causes ClassCastException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.4, 2.0.9, 3.0.0 PDFBox
    • Fix Version/s: 2.0.10, 3.0.0 PDFBox
    • Component/s: AcroForm
    • Labels:
      None

      Description

      I am trying to fetch the options available for a PDChoice field in a form but get a ClassCastException from the PDFBox internals.

      The problematic PDF is an Inheritance Tax form from the UK's Revenue and Customs. Specifically, I am looking at IHT405:

      https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/697346/IHT405_online.pdf

      I use this code to iterate over the fields:

      	PDDocument doc = PDDocument.load(resource.getFile());
      	PDDocumentCatalog catalog = doc.getDocumentCatalog();
      	PDAcroForm form = catalog.getAcroForm();
      	for (PDField field : form.getFields()) {
      		if ("Ch".equals(field.getFieldType())) {
      			PDChoice choice = (PDChoice) field;
      			// All these variants fail with a ClassCastException:
      			choice.getOptions();
      			choice.getOptionsDisplayValues();
      			choice.getOptionsExportValues(); // internally just delegates to getOptions()
      		}
      	}
      

      This is the stack trace for one of them, the getOptionsExportValues() call:

      	java.lang.ClassCastException: org.apache.pdfbox.cos.COSArray cannot be cast to org.apache.pdfbox.cos.COSString
      		at org.apache.pdfbox.pdmodel.common.COSArrayList.convertCOSStringCOSArrayToList(COSArrayList.java:367)
      		at org.apache.pdfbox.pdmodel.interactive.form.FieldUtils.getPairableItems(FieldUtils.java:182)
      		at org.apache.pdfbox.pdmodel.interactive.form.PDChoice.getOptions(PDChoice.java:91)
      		at org.apache.pdfbox.pdmodel.interactive.form.PDChoice.getOptionsExportValues(PDChoice.java:210)
      
      

      The problem is that the array expected to hold only strings (the "stringArray") also contains COSArrays, each holding the value and label of an option:

      	COSArray{[COSString{ }, COSArray{[COSString{Mr}, COSString{MR}]}, COSArray{[COSString{Mrs}, COSString{MRS}]}, COSArray{[COSString{Miss}, COSString{MISS}]}, COSArray{[COSString{Ms}, COSString{MS}]}]}
      

      FieldUtils.getPairableItems does not seem to expect this: it inspects only the first item of the array and, because that item is a COSString, treats the whole array as an array of strings.
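
      The per-entry check that this mixed array calls for can be sketched as follows. This is only an illustration and not the PDFBox fix: it uses plain Java collections as stand-ins (a String for a COSString, a two-element List for a COSArray) so that it runs without PDFBox, and the class and method names are hypothetical. It assumes index 0 of a pair is the export value and index 1 the display text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Plain-Java stand-in for reading a choice field's options array.
 * Assumption: a String models a COSString, and a two-element List
 * models a COSArray holding an (export value, display text) pair.
 */
final class OptEntriesSketch {

    /**
     * Collects one value per entry, checking the type of every entry
     * instead of only the first one.
     *
     * @param opt       the mixed options array
     * @param pairIndex 0 for export values, 1 for display values
     */
    static List<String> values(List<?> opt, int pairIndex) {
        List<String> result = new ArrayList<>();
        for (Object entry : opt) {
            if (entry instanceof String) {
                // Plain string: export value and display text are identical.
                result.add((String) entry);
            } else if (entry instanceof List) {
                List<?> pair = (List<?>) entry;
                // Take the requested half of the pair; malformed pairs are skipped.
                if (pair.size() > pairIndex && pair.get(pairIndex) instanceof String) {
                    result.add((String) pair.get(pairIndex));
                }
            }
            // Entries of any other type are ignored in this sketch.
        }
        return result;
    }

    public static void main(String[] args) {
        // Same shape as the array dumped in this report.
        List<Object> opt = Arrays.asList(
                " ",
                Arrays.asList("Mr", "MR"),
                Arrays.asList("Mrs", "MRS"),
                Arrays.asList("Miss", "MISS"),
                Arrays.asList("Ms", "MS"));
        System.out.println(values(opt, 0)); // export values
        System.out.println(values(opt, 1)); // display values
    }
}
```

      Deciding per entry rather than from the first item means a leading plain string, such as the blank option here, no longer determines how the remaining entries are cast.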

      I found the bug with PDFBox 2.0.4 and upgraded to 2.0.9, which did not help.

            People

            • Assignee:
              msahyoun Maruan Sahyoun
              Reporter:
              msahyoun Maruan Sahyoun