PDFBox / PDFBOX-4431

PDFBox recognizes only a few words

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Bug
    • Environment:
      OS: Windows 10
      IDE: Eclipse Oxygen.3a Release (4.7.3a)
      PDF version: Adobe Acrobat Pro DC - 2019.010.20069.49826

    Description

      The code I posted takes five arguments, including the path to a PDF document and a search term. It is meant to parse the PDF, find every match for the keyword, and return the location of each match in the format selected by the last argument.

      For some reason the code recognizes only a few words and fails with an error on others. I am not sure why.

      There seems to be no difference between these words in font, size, location, etc.
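
The reporter's code is not reproduced on this page, so the following is only a minimal sketch of the keyword-search step, written as a diagnostic aid. It assumes the page text has already been extracted into a String (for example with PDFBox's `PDFTextStripper.getText(PDDocument)`), and it reports character offsets in that text rather than page coordinates. The class name `KeywordLocator` and the method `findOffsets` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class KeywordLocator {

    /**
     * Returns the start offset of every case-insensitive occurrence of
     * keyword in text. Offsets index into the original text.
     * Hypothetical helper: not part of PDFBox or of the reporter's code.
     */
    public static List<Integer> findOffsets(String text, String keyword) {
        List<Integer> offsets = new ArrayList<>();
        if (text == null || keyword == null || keyword.isEmpty()) {
            return offsets; // nothing to search for
        }
        int len = keyword.length();
        for (int i = 0; i + len <= text.length(); i++) {
            // regionMatches(true, ...) compares while ignoring case
            if (text.regionMatches(true, i, keyword, 0, len)) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a PDF (assumed input).
        String extracted = "Total revenue rose. Revenue per user fell.";
        System.out.println(findOffsets(extracted, "revenue"));
    }
}
```

If a word that is visible in the PDF yields an empty list here, the extracted text itself may not contain that word in the expected form, which would point at the document's text layer rather than at the search logic.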

      Attachments

        1. RS13170.pdf (11.77 MB), Krutheeka Rajkumar
        2. RS13170.txt (80 kB), Tilman Hausherr

          People

            Assignee: Unassigned
            Reporter: Krutheeka Rajkumar (K_TorontoVic)
            Votes: 0
            Watchers: 3