FOP-2466

Improve output for pre-hyphenated text with SHY combined with hyphenation properties

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.1
    • Fix Version/s: None
    • Component/s: layout/line

    Description

      When processing an FO file that contains pre-hyphenated text, i.e. text with explicit soft hyphens (SHY, U+00AD), FOP's own hyphenation does not yield usable results.

      From the corresponding thread on fop-users@:


      The accumulated sequence of characters since the previous break opportunity is taken to be a 'word', which may or may not end in a soft hyphen. If it does, a specific sequence of elements is glued to the word-box to prevent a break before the SHY and to make sure that it is properly rendered, i.e. the hyphen only counts if the break occurs right after it.

      As hyphenation by FOP itself is applied at a higher level, once all layout elements for a whole paragraph have been collected, that SHY sequence is seen as a word boundary. That is, that part of the algorithm just accumulates the text for 'uninterrupted' sequences of word-boxes and feeds those pieces to the hyphenator. The real intention is to apply hyphenation across any nested fo:inlines. 'Uninterrupted' means that auxiliary elements, generated for border or padding, are explicitly not considered word boundaries. The sequence generated for SHY, however, contains two non-auxiliary elements, as if it were a space, perhaps just to ensure that that position in the layout always leads to a character that is visibly rendered.

      In the case of pre-hyphenated text, this has the unintended effect of restricting the input for the hyphenator to parts of words, which is basically meaningless (and wasteful).
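
      The effect can be illustrated with a small, self-contained sketch. It is a simplified model only: the Kind, Element and hyphenatorInput names are hypothetical and do not correspond to FOP's actual element or layout manager classes. It merely shows how treating the SHY sequence as a boundary splits the text that reaches the hyphenator.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model; these types are hypothetical, not FOP's real classes. */
final class WordAccumulationSketch {

    enum Kind { BOX, AUXILIARY, SHY_SEQUENCE, SPACE }

    static final class Element {
        final Kind kind;
        final String text; // only meaningful for BOX elements

        Element(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /**
     * Collects the strings that would be handed to the hyphenator.
     * Auxiliary elements never end a word. A SHY sequence ends a word
     * only when shyIsBoundary is true (the behaviour described above).
     */
    static List<String> hyphenatorInput(List<Element> elements, boolean shyIsBoundary) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (Element e : elements) {
            switch (e.kind) {
                case BOX:
                    current.append(e.text);
                    break;
                case AUXILIARY:
                    // border/padding: explicitly not a word boundary
                    break;
                case SHY_SEQUENCE:
                    if (shyIsBoundary) {
                        flush(current, words);
                    }
                    break;
                case SPACE:
                    flush(current, words);
                    break;
            }
        }
        flush(current, words);
        return words;
    }

    /** Moves the accumulated text, if any, into the result list. */
    private static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString());
            current.setLength(0);
        }
    }
}
```

      For the boxes "hy", SHY, "phen", an auxiliary element, "ation", with shyIsBoundary set to true the hyphenator would receive "hy" and "phenation" separately. With it set to false, it would receive the whole word "hyphenation".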

      Among other things, this leads to the "hyphenation-ladder-count" property seemingly having no effect.

      Note: At this point, I believe the behaviour is not necessarily incorrect. I also think it would be correct to ignore hyphenation-ladder-count when hyphenate="false".

      Initial idea for a fix:
      Make sure that the SHY sequence is not treated as a word boundary in LineLM when accumulating text for boxes generated by the TextLMs. Once that is done, we should be able to check, for each hyphenation point that FOP itself calculates, whether an explicit SHY is already present at that same point. In that case, we can just do nothing (= leave the SHY in place).
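
      A minimal sketch of the second half of that idea, assuming the hyphenator reports its break points as character offsets into the word with all SHY characters stripped. The class and method names are hypothetical and are not part of FOP's API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of FOP. */
final class ShyMergeSketch {

    /** Soft hyphen, U+00AD. */
    static final char SHY = '\u00AD';

    /**
     * Returns the offsets, counted in the SHY-free word, at which the
     * source text already carries an explicit soft hyphen.
     */
    static List<Integer> explicitShyOffsets(String text) {
        List<Integer> offsets = new ArrayList<>();
        int plainIndex = 0; // position in the word with SHY removed
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == SHY) {
                offsets.add(plainIndex);
            } else {
                plainIndex++;
            }
        }
        return offsets;
    }

    /**
     * Filters the computed hyphenation points, dropping those that
     * coincide with an explicit SHY. For the dropped points nothing
     * needs to be done: the existing SHY stays in place.
     */
    static List<Integer> pointsToInsert(String text, List<Integer> computedPoints) {
        List<Integer> explicit = explicitShyOffsets(text);
        List<Integer> result = new ArrayList<>();
        for (Integer point : computedPoints) {
            if (!explicit.contains(point)) {
                result.add(point);
            }
        }
        return result;
    }
}
```

      For the text "hy", SHY, "phen", SHY, "ation" and computed points 2, 6 and 7, only offset 7 would remain as a new hyphenation point. Offsets 2 and 6 are already covered by the explicit soft hyphens.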

          People

            Assignee: Unassigned
            Reporter: Andreas L. Delmelle (adelmelle)