  1. PDFBox
  2. PDFBOX-4532

PDFTextStripper replacing the decimal with white space


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.15
    • Fix Version/s: None
    • Component/s: Text extraction

    Description

      Specifically, I'm using PDFTextStripperByArea to extract a particular area of the document.

      In the output, most of the numbers (all but one) have their decimal point replaced by a white space. When I copy and paste the same text from Adobe Reader or Chrome, the decimal points are preserved.
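The comparison described above (stripper output versus text copied from a viewer) can be automated with a small check. What follows is only a minimal sketch, not part of the original report: the class name `DecimalLossCheck`, the method `findLostDecimals`, and the sample strings are all hypothetical, and the extracted text is assumed to have already been obtained, for example from `PDFTextStripperByArea.getTextForRegion`. It reports each decimal number in the reference text that appears in the extracted text only with its point replaced by whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DecimalLossCheck {
    // Matches simple decimal numbers such as 12.45 or 0.75 in the reference text.
    private static final Pattern DECIMAL = Pattern.compile("\\d+\\.\\d+");

    /**
     * Returns the decimal numbers from the reference text (e.g. copied from a
     * PDF viewer) that are absent from the extracted text but present there
     * with whitespace in place of the decimal point.
     */
    public static List<String> findLostDecimals(String reference, String extracted) {
        List<String> lost = new ArrayList<>();
        Matcher m = DECIMAL.matcher(reference);
        while (m.find()) {
            String number = m.group();
            String[] parts = number.split("\\.");
            // Same digits, but whitespace where the point should be, and not
            // embedded in a longer run of digits.
            Pattern broken = Pattern.compile(
                    "(?<!\\d)" + parts[0] + "\\s+" + parts[1] + "(?!\\d)");
            if (!extracted.contains(number) && broken.matcher(extracted).find()) {
                lost.add(number);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // Made-up sample strings, not taken from the attached PDFs.
        String reference = "NAV 12.45  Yield 3.10  Expense 0.75";
        String extracted = "NAV 12 45  Yield 3 10  Expense 0.75";
        System.out.println(findLostDecimals(reference, extracted)); // prints [12.45, 3.10]
    }
}
```

This only detects the symptom; it does not explain why the stripper emits a space, and it would miss numbers that use other separators or formats.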

      Attachments

        1. SO71723006.pdf
          193 kB
          Tilman Hausherr
        2. PDFBOX-4532-reduced.pdf
          85 kB
          Tilman Hausherr
        3. numbers_without_decimal.PNG
          5 kB
          Akash Gupta
        4. FSUSA00BDD.pdf
          275 kB
          Akash Gupta
        5. code_textStripper.PNG
          10 kB
          Akash Gupta

            People

              Assignee: Unassigned
              Reporter: Akash Gupta (akashsgpgi)
              Votes: 0
              Watchers: 5
