  1. PDFBox
  2. PDFBOX-957

Text extraction using ExtractText (PDF file as input) generates some weird characters

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 1.4.0
    • Fix Version/s: None
    • Component/s: Text extraction
    • Environment:
      Windows 7

      Description

      When I try to extract text from a PDF document, the output contains gibberish text.
      ExtractText.exe "\Jobvite\Resumes\Resume-Boston.pdf" Resume-Boston.txt

      I will provide the PDF documents on request; I could not find a way to include attachments.

        Attachments

        1. Resume2.pdf
          68 kB
          Ashok Chigullapally
        2. Resume1.pdf
          68 kB
          Ashok Chigullapally

            People

            • Assignee:
              lehmi Andreas Lehmkühler
              Reporter:
              ashokc Ashok Chigullapally
            • Votes:
              0
              Watchers:
              0