PDFBox / PDFBOX-915

Some PDF files containing Chinese text are not extracted with the correct encoding

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 1.3.1, 2.0.0
    • Fix Version/s: None
    • Component/s: Text extraction
    • Labels: None
    • Environment: jdk1.5

    Description

      I used PDFTextStripper to extract the contents of PDF files that contain Chinese text. Some files are extracted correctly, but others are extracted with the wrong encoding.
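To sort the files that extract correctly from those that do not, one option is to measure how much of the extracted text is actually made up of Han ideographs. The following is a minimal sketch, not part of PDFBox: it assumes the text has already been obtained as a String (for example from PDFTextStripper), and the class name `ExtractionCheck`, the methods `hanRatio` and `looksGarbled`, and the 0.3 threshold are all hypothetical. The garbled sample in `main` is produced by deliberately decoding UTF-8 bytes as ISO-8859-1, which only simulates bad output and is not claimed to be the cause of this issue. It uses `Character.UnicodeScript`, so it needs Java 7 or later.

```java
import java.lang.Character.UnicodeScript;
import java.nio.charset.StandardCharsets;

public class ExtractionCheck {

    /** Fraction of non-whitespace code points that belong to the Han script. */
    static double hanRatio(String text) {
        int total = 0;
        int han = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                continue; // whitespace says nothing about the encoding
            }
            total++;
            if (UnicodeScript.of(cp) == UnicodeScript.HAN) {
                han++;
            }
        }
        return total == 0 ? 0.0 : (double) han / total;
    }

    /** Hypothetical heuristic: text expected to be Chinese but with few Han characters. */
    static boolean looksGarbled(String text, double minHanRatio) {
        return hanRatio(text) < minHanRatio;
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the word for "Chinese (language)", written with escapes
        // so the source file compiles regardless of the compiler's default encoding.
        String good = "\u4e2d\u6587";
        // Simulated bad output: UTF-8 bytes misread as ISO-8859-1.
        String bad = new String(good.getBytes(StandardCharsets.UTF_8),
                                StandardCharsets.ISO_8859_1);

        System.out.println("good looks garbled: " + looksGarbled(good, 0.3));
        System.out.println("bad looks garbled:  " + looksGarbled(bad, 0.3));
    }
}
```

Running the check over the text of each PDF gives a quick list of the files worth attaching to the report. The threshold would need tuning for documents that mix Chinese with a lot of Latin text or digits.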

      Attachments

        1. 821-2302.pdf
          835 kB
          chenlong

          People

            Assignee: Andreas Lehmkühler (lehmi)
            Reporter: chenlong (nirvanasob)
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved: