PDFBox / PDFBOX-932

Swedish characters are garbled in form

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 2.0.0
    • Component/s: AcroForm
    • Environment: Mac OSX, Java6

    Description

      When using Swedish characters to fill in a form, they show up garbled in the PDF. This seems to be related to the PDAppearance class. When calling setValue on the field, the value seems to be set correctly, since COSString handles characters outside ASCII in its writePDF method. When PDAppearance writes the value in insertGeneratedAppearance, it does not do the same check. If the same check is done, it seems to work for PDAppearance too (see patch below). Since I do not know very much about the PDF format, I don't know if this is the right way to do it...

      PDDocument document = PDDocument.load(<pdf-file>);
      PDDocumentCatalog docCatalog = document.getDocumentCatalog();
      PDAcroForm form = docCatalog.getAcroForm();
      PDField field = form.getField(<field name>);
      field.setValue("åäö");

      @@ -400,9 +401,32 @@
                   {
                       throw new IOException( "Error: Unknown justification value:" + q );
                   }
      -            printWriter.println("(" + value + ") Tj");
      -            printWriter.println("ET" );
      -            printWriter.flush();
      +            boolean outsideASCII = false;
      +            byte[] bytes = value.getBytes("ISO-8859-1");
      +            int length = bytes.length;
      +
      +            for( int i=0; i<length && !outsideASCII; i++ )
      +            {
      +                //if the byte is negative then it is an eight bit byte and is
      +                //outside the ASCII range.
      +                outsideASCII = bytes[i] <0;
      +            }
      +            if(!outsideASCII)
      +            {
      +                printWriter.println("(" + value + ") Tj");
      +                printWriter.println("ET" );
      +                printWriter.flush();
      +            }
      +            else
      +            {
      +                printWriter.print("<");
      +                for(int i=0; i<length; i++ )
      +                {
      +                    String val = COSHEXTable.HEX_TABLE[ (bytes[i]+256)%256 ];
      +                    printWriter.write(val);
      +                }
      +                printWriter.println("> Tj");
      +                printWriter.println("ET" );
      +                printWriter.flush();
      +            }
               }
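
      The check described above can be tried outside PDFBox. The following is only a minimal sketch of the idea, not the code that was committed for this issue: the class PdfStringOperand and its methods are hypothetical names, and the local HEX array stands in for COSHEXTable.HEX_TABLE. It encodes the value as ISO-8859-1, writes a literal string "(...) Tj" when every byte is 7-bit ASCII, and otherwise writes a hex string "<...> Tj". Escaping of parentheses and backslashes in the literal form is left out, as in the patch.

```java
import java.nio.charset.StandardCharsets;

public class PdfStringOperand {
    // Stand-in for COSHEXTable.HEX_TABLE (assumption: any hex digit table works here)
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Returns true if any byte lies outside the 7-bit ASCII range. */
    public static boolean hasNonAscii(byte[] bytes) {
        for (byte b : bytes) {
            // Java bytes are signed, so values 0x80..0xFF show up as negative
            if (b < 0) {
                return true;
            }
        }
        return false;
    }

    /** Builds the "show text" operator line for the given field value. */
    public static String showText(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.ISO_8859_1);
        if (!hasNonAscii(bytes)) {
            // Pure ASCII: a literal string operand is enough
            return "(" + value + ") Tj";
        }
        // Otherwise emit a hex string operand, two hex digits per byte
        StringBuilder sb = new StringBuilder("<");
        for (byte b : bytes) {
            int v = b & 0xFF; // same result as (b + 256) % 256
            sb.append(HEX[v >>> 4]).append(HEX[v & 0x0F]);
        }
        return sb.append("> Tj").toString();
    }

    public static void main(String[] args) {
        System.out.println(showText("abc"));
        // "åäö" written with Unicode escapes to avoid source-encoding issues
        System.out.println(showText("\u00e5\u00e4\u00f6"));
    }
}
```

      With this sketch, "abc" stays a literal string, while "åäö" becomes the hex string <E5E4F6>, since å, ä and ö are the bytes 0xE5, 0xE4 and 0xF6 in ISO-8859-1.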

          People

            Assignee: Maruan Sahyoun (msahyoun)
            Reporter: Pär Wenåker (parwen)
            Votes: 1
            Watchers: 0
