  1. PDFBox
  2. PDFBOX-1123

Not able to read field values from a PDF File if the field contains special characters.

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Not A Problem

    Description

      Hi, I am trying to read the field names in a PDF file. This works for most files, but in some files we cannot read the field ID/name. The affected files have field names such as:
      topmostSubform[0].Page1[0].c1_04_0_[0]
      topmostSubform[0].Page1[0].c1_09_0_
      topmostSubform[0].Page2[0].Table_Line4a[0].#subform[1].p2-t69[0]
      All of these field names start with topmostSubform[0]. When we read a name with PDField.getPartialName(), it is truncated at the first '.' and we get only topmostSubform[0]. Because every field name starts with that same prefix, the total field count comes out as 1. We suspect that special characters such as '.', '_', and '#' are causing the issue. Could you please suggest how to handle this? This is very critical for us.
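
      The behaviour described above is consistent with how hierarchical form fields are named: each node in the field tree carries only its own partial name, and the fully qualified name is built by joining the partial names from the root down with '.'. A form whose fields all live under topmostSubform[0] therefore has a single root field, and the individual fields are reached by walking its children. The sketch below illustrates that idea with a plain Java model. The Node class, leafNames, and sampleForm are hypothetical stand-ins written for this example and are not PDFBox classes. In PDFBox the corresponding calls would likely be PDField.getFullyQualifiedName() plus a walk over each field's kids, but the traversal API differs between PDFBox versions, so treat that mapping as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldTreeSketch {

    /** Hypothetical stand-in for a form field node; NOT a PDFBox class. */
    static final class Node {
        final String partialName;   // the node's own name, without any '.'
        final Node parent;          // null for a root field
        final List<Node> kids = new ArrayList<>();

        Node(String partialName, Node parent) {
            this.partialName = partialName;
            this.parent = parent;
            if (parent != null) {
                parent.kids.add(this);
            }
        }

        /** Joins the partial names from the root down to this node with '.'. */
        String fullyQualifiedName() {
            return parent == null
                    ? partialName
                    : parent.fullyQualifiedName() + "." + partialName;
        }
    }

    /** Walks the tree depth-first and returns the full names of all leaf fields. */
    static List<String> leafNames(List<Node> roots) {
        List<String> names = new ArrayList<>();
        for (Node root : roots) {
            collect(root, names);
        }
        return names;
    }

    private static void collect(Node node, List<String> out) {
        if (node.kids.isEmpty()) {
            out.add(node.fullyQualifiedName());
            return;
        }
        for (Node kid : node.kids) {
            collect(kid, out);
        }
    }

    /** Builds a tree shaped like the three field names quoted in the report. */
    static List<Node> sampleForm() {
        Node root = new Node("topmostSubform[0]", null);
        Node page1 = new Node("Page1[0]", root);
        new Node("c1_04_0_[0]", page1);
        new Node("c1_09_0_", page1);
        Node page2 = new Node("Page2[0]", root);
        Node table = new Node("Table_Line4a[0]", page2);
        Node subform = new Node("#subform[1]", table);
        new Node("p2-t69[0]", subform);

        List<Node> roots = new ArrayList<>();
        roots.add(root);
        return roots;
    }

    public static void main(String[] args) {
        // One root field, but three leaf fields once the tree is walked.
        for (String name : leafNames(sampleForm())) {
            System.out.println(name);
        }
    }
}
```

      In this model the root list has one entry, which matches the count of 1 seen in the report, while the walk recovers all three full names. The characters '_' and '#' play no special role here; only '.' acts as the separator between levels.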

      Attachments

        1. fspl.pdf
          361 kB
          Rubesh MX

          People

            Assignee: Unassigned
            Reporter: Rubesh MX (hellorubesh)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 12h
                Remaining: 12h
                Logged: Not Specified