FOP-2655

`<fo:block>` hyphenation introduced unwanted dash characters


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1
    • None
    • None
    • DocBook XML DTD: 4.2
      DocBook XSL: 1.79.1
      XSLT Processor: XSLTproc 1.1.28
      XSL-FO Processor: Apache FOP 2.1
      Runtime: Cygwin 1.7.28 32-bit
      JRE: 1.7.0 update 10
      System: Microsoft Windows XP Professional SP3

    Description

      This problem was originally reported at:
      https://sourceforge.net/p/docbook/bugs/1386/

      I have used Apache FOP to format the XSL-FO output of the DocBook XSL
      rendering of my technical book for quite some time without major problems.
      But as soon as I turned on the hyphenated wrapping option for a preformatted
      paragraph whose text contains an example command line invocation, a nasty
      problem arose:

      Original text:

      command -S -longeropt -thisisaverylongprogramoption -optionwitharg "ARGUMENT" FILENAME1 FILENAME2
      

      XSL-FO paragraph:

      <fo:block id="idm1201955836" wrap-option="wrap" hyphenation-character="&#xBB;" text-align="start" space-before.minimum="0.8em" space-before.optimum="1em" space-before.maximum="1.2em" space-after.minimum="0.8em" space-after.optimum="1em" space-after.maximum="1.2em" hyphenate="false" white-space-collapse="false" white-space-treatment="preserve" linefeed-treatment="preserve" font-family="monospace">command&#160;&#173;-S&#160;&#173;-longeropt&#160;&#173;-thisisaverylongprogramoption&#160;&#173;-optionwitharg&#160;&#173;"ARGUMENT"&#160;&#173;FILENAME1&#160;&#173;FILENAME2</fo:block>
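
Comparing the original text with the block content above, each space appears to have been replaced by a no-break space (U+00A0) followed by a soft hyphen (U+00AD). The following is a minimal Python sketch of that substitution, useful for building test input by hand. The function name `mark_break_points` is hypothetical, and this is not the DocBook XSL implementation, only an imitation of its visible result.

```python
# Sketch only: imitates the substitution visible in the XSL-FO block.
# mark_break_points is a hypothetical helper, not part of DocBook XSL or FOP.

NBSP = "\u00a0"  # no-break space, written as &#160; in the quoted block
SHY = "\u00ad"   # soft hyphen, written as &#173; in the quoted block


def mark_break_points(text: str) -> str:
    """Replace every space with NBSP + SHY so a line may break only there."""
    return text.replace(" ", NBSP + SHY)


original = (
    'command -S -longeropt -thisisaverylongprogramoption '
    '-optionwitharg "ARGUMENT" FILENAME1 FILENAME2'
)
marked = mark_break_points(original)
```

The result contains no plain spaces, so the only break opportunities left to the formatter are the soft hyphens.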
      

      Expected rendering output is something like:

      command -S -longeropt -thisisaverylongprogramoption -optionwitharg >>
      "ARGUMENT" FILENAME1 FILENAME2
      

      (The configured hyphenation symbol is "»", U+00BB, shown here as ">>".)

      But the actual PDF rendering output is:

      command --S --longeropt --thisisaverylongprogramoption --
      optionwitharg -"ARGUMENT" FILENAME1 FILENAME2
      

      Note that:

      • An extra dash is added in front of every word that originally has a dash or quote in front of it.
      • The expected hyphenation symbol is not shown at the end of the wrapped line.
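
The two observations above can be checked mechanically against text copied out of the PDF. The following is a minimal Python sketch, assuming the rendered text is available as a plain string; `unwrap` is a hypothetical helper, and "»" stands for the symbol shown as ">>" in this report.

```python
# Sketch only: undo the line wrapping of rendered text and compare it with
# the source command line. unwrap is a hypothetical helper for this report.

def unwrap(rendered: str, hyphenation_char: str = "\u00bb") -> str:
    """Join wrapped lines, dropping a trailing hyphenation symbol per line."""
    parts = []
    for line in rendered.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(hyphenation_char):
            line = line[: -len(hyphenation_char)].rstrip()
        parts.append(line)
    return " ".join(parts)


original = (
    'command -S -longeropt -thisisaverylongprogramoption '
    '-optionwitharg "ARGUMENT" FILENAME1 FILENAME2'
)
expected_pdf = (
    'command -S -longeropt -thisisaverylongprogramoption -optionwitharg \u00bb\n'
    '"ARGUMENT" FILENAME1 FILENAME2'
)
actual_pdf = (
    'command --S --longeropt --thisisaverylongprogramoption --\n'
    'optionwitharg -"ARGUMENT" FILENAME1 FILENAME2'
)

expected_ok = unwrap(expected_pdf) == original  # expected rendering round-trips
actual_ok = unwrap(actual_pdf) == original      # actual rendering does not
```

With the expected rendering the unwrapped text equals the source command line; with the actual rendering it does not, because of the extra dashes.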

      This results in an invalid command line example that cannot be used.
      But the behavior is not consistent across all kinds of text. For example,
      if I simply remove the dashes and quotes from the source text, the paragraph
      formats correctly:

      Original text:

      command S longeropt thisisaverylongprogramoption optionwitharg ARGUMENT FILENAME1 FILENAME2
      

      XSL-FO paragraph:

      <fo:block id="idm1203103004" wrap-option="wrap" hyphenation-character="&#xBB;" text-align="start" space-before.minimum="0.8em" space-before.optimum="1em" space-before.maximum="1.2em" space-after.minimum="0.8em" space-after.optimum="1em" space-after.maximum="1.2em" hyphenate="false" white-space-collapse="false" white-space-treatment="preserve" linefeed-treatment="preserve" font-family="monospace">command&#160;&#173;S&#160;&#173;longeropt&#160;&#173;thisisaverylongprogramoption&#160;&#173;optionwitharg&#160;&#173;ARGUMENT&#160;&#173;FILENAME1&#160;&#173;FILENAME2</fo:block>
      

      The actual PDF rendering output is:

      command S longeropt thisisaverylongprogramoption optionwitharg >>
      ARGUMENT FILENAME1 FILENAME2
      

      Note that:

      • No extra dash is added.
      • The expected hyphenation symbol is now shown at the end of the wrapped line.

      Example DocBook XML-based test files (including the full XSL-FO code and
      PDF output) for the above cases can be found at http://www.mediafire.com/?2pnw0atgetd5may.
      Inside, the first case can be found as `programlisting-extradash.*`,
      and the second case as `programlisting-extradash-nopunc.*`.

      Note: The XSL-FO files in the link are encoded in UTF-8 and use nonprintable
      special characters. The quoted versions (shown above) have them
      substituted with the corresponding numeric character references to improve
      readability. The rendered PDF result is the same in either case.
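
For anyone inspecting the UTF-8 files directly, the same substitution can be reproduced with a short script. The following is a minimal Python sketch, assuming only the two nonprintable characters seen in the quoted blocks need to be made visible; `show_invisibles` is a hypothetical helper name.

```python
# Sketch only: make the nonprintable characters of the XSL-FO files visible
# by writing them as numeric character references, as in the quoted blocks.
# show_invisibles is a hypothetical helper for this report.

INVISIBLES = {
    "\u00a0": "&#160;",  # no-break space
    "\u00ad": "&#173;",  # soft hyphen
}


def show_invisibles(text: str) -> str:
    """Replace each nonprintable character with its numeric reference."""
    for char, reference in INVISIBLES.items():
        text = text.replace(char, reference)
    return text


sample = "command\u00a0\u00ad-S\u00a0\u00ad-longeropt"
visible = show_invisibles(sample)
```

To apply it to a whole file, read the file with `encoding="utf-8"` and pass its contents through the function.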

          People

            Assignee: Unassigned
            Reporter: Nutchanon Wetchasit (nachanon)