Uploaded image for project: 'PDFBox'
  1. PDFBox
  2. PDFBOX-1242

Handle non ISO-8859-1 chars with drawString

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0, 1.6.0
    • Fix Version/s: 2.0.0
    • Component/s: Writing
    • Labels:
      None

      Description

      The PDPageContentStream.drawString take a String as argument, it construct a COSString of the input.
      If the input contain chars above 255, the COSString is prefixed 0xFe, 0xff and the bytes are taken from the
      input as "UTF-16BE" encoded.

      Back in the drawString method this unicode16 encoded COSString is appended as a "ISO-8859-1"

      appendRawCommands( new String( buffer.toByteArray(), "ISO-8859-1"));

      The result of this is that a line with UTF-16 chars is shown prefix with þÿ, and with double space between the other chars.
      The chars above 255 are shown as the two corresponding ISO-8859-1 characters.

      As a side question to this observation, is there an alternative way to use Pdfbox, to support UTF16?

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                jahewson John Hewson
                Reporter:
                peterandersen Peter Andersen
              • Votes:
                1 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: