  1. PDFBox
  2. PDFBOX-957

Text extraction using ExtractText (with a PDF file as input) generates some weird characters

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 1.4.0
    • Fix Version/s: None
    • Component/s: Text extraction
    • Environment: Windows 7

    Description

      When I try to extract text from a PDF document, the output contains gibberish text.
      ExtractText.exe "\Jobvite\Resumes\Resume-Boston.pdf Resume-Boston.txt

      I will provide the PDF documents on request; I could not find a way to include attachments.

      Attachments

        1. Resume1.pdf
          68 kB
          Ashok Chigullapally
        2. Resume2.pdf
          68 kB
          Ashok Chigullapally

          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: Ashok Chigullapally (ashokc)
            Votes: 0
            Watchers: 0
