Tapestry 5 / TAP5-840

Support character references in tml files with HTML 5 Doctype

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.3, 5.2.6, 5.1.0.5, 5.0.18
    • Fix Version/s: 5.3
    • Component/s: tapestry-core
    • Labels:
      None

      Description

      Currently, to use HTML named character references (e.g. &copy;) you need to put an XHTML doctype at the top of the TML file.

      e.g. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

      However for HTML 5 they have stopped using XML doctypes and instead use
      <!DOCTYPE html>

      If you change a Tapestry page to use this, you can no longer use named entities, because the XML parser has no DTD from which to resolve them.

      Ideally there should be some kind of logic that detects <!DOCTYPE html> and includes a suitable DTD to resolve the common HTML entities. The HTML 5 specification defines the allowed named character references - http://dev.w3.org/html5/spec/Overview.html#named-character-references. There doesn't seem to be a DTD of allowed references maintained anymore.

      1. patch.txt
        17 kB
        Robin Komiwes

        Activity

        Massimo Lusetti added a comment -

        Doesn't this affect 5.1 and trunk too?

        Ben Gidley added a comment -

        It does affect 5.1. Not sure about trunk, but I see no reason why it wouldn't.

        Robin Komiwes added a comment -

        As I needed HTML5 for some projects, I made the fix.

        I explored many approaches, such as DTD overriding and custom external entity resolving, without success. I also tried to use TemplateParserImpl.resolveEntity() and to add a hack for HTML5 like the one already in place for HTML4, but still without success.

        Finally, I tried setting the XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES property to false and adding a new TemplateToken implementation for entities, and it works.

        Now every entity, with or without a doctype set, is simply output as is.

        I created a new symbol to offer the possibility to go back to the old system.

        PFA the patch with tests.
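
The entity pass-through idea described in this comment can be sketched with the standard StAX API. This is a minimal illustration, not the code from the attached patch: the class `EntityPassThrough` and its `render` method are hypothetical names, it runs against whatever StAX implementation the JDK provides rather than Tapestry's template parser, and the input declares the entity in an internal DTD subset so that it is well-formed regardless of how a given parser treats undeclared entities.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects the text of a document, writing entity
// references back out by name instead of expanding them.
final class EntityPassThrough {

    static String render(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to report entity references as events
        // rather than replacing them with their expansion.
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
        // Coalescing would merge adjacent text and hide the references.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);

        StringBuilder out = new StringBuilder();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS) {
                    out.append(reader.getText());
                } else if (event == XMLStreamConstants.ENTITY_REFERENCE) {
                    // For this event type, getLocalName() is the entity name.
                    out.append('&').append(reader.getLocalName()).append(';');
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Unable to parse template", e);
        }
        return out.toString();
    }
}
```

Under these assumptions, calling `EntityPassThrough.render` on a paragraph containing `&copy;` is intended to return the text with the reference still written as `&copy;`, which is the "output as is" behaviour the comment describes.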

        Bob Harner added a comment -

        The supplied patch has grown stale and no longer applies cleanly against the trunk (but could probably be fixed). I guess this issue is awaiting consideration of whether to write a simple custom XML parser to replace the SAX parser (see http://tapestry.1045711.n5.nabble.com/State-on-HTML5-integration-woodstox-rollback-td2470926.html).

        As a workaround, using entity numbers instead of names (&#169; instead of &copy;, &#8482; instead of &trade;, etc.) works fine with the HTML5 doctype.
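
For templates that already contain named references, the workaround could be applied mechanically before the template reaches the parser. The following is only a sketch of that idea: `NumericEntities`, `toNumeric`, and the three-entry lookup table are hypothetical and purely illustrative, and a real table would need every name the templates use. Names that are not in the table, including XML's predefined `&amp;` and `&lt;`, are left untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites known named character references
// into numeric ones, which need no DTD to resolve.
final class NumericEntities {

    // Illustrative subset only; extend as needed.
    private static final Map<String, Integer> CODE_POINTS = new HashMap<>();
    static {
        CODE_POINTS.put("copy", 169);
        CODE_POINTS.put("trade", 8482);
        CODE_POINTS.put("nbsp", 160);
    }

    private static final Pattern NAMED = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    static String toNumeric(String template) {
        Matcher matcher = NAMED.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            Integer codePoint = CODE_POINTS.get(matcher.group(1));
            // Unknown names are copied through unchanged.
            String replacement = codePoint == null
                    ? matcher.group()
                    : "&#" + codePoint + ";";
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

A preprocessing step like this changes the template text the parser sees, so it is a stopgap rather than a substitute for entity support in the parser itself.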

        Lenny Primak added a comment -

        The problem with using entity numbers instead of names is that graphical web page editors (e.g. Dreamweaver) still put &nbsp; and such in the HTML code. As a result, HTML5 templates are currently not editable by graphic designers, which defeats the whole point of Tapestry templates being designer-editable.

        Howard M. Lewis Ship added a comment -

        I'm taking a crack at this as an ugly, but effective hack.

        Before parsing the template, we look at the very first line, to see if it is exactly:

        <!DOCTYPE html>

        If it is, then we substitute an alternate doctype:

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

        ... this involves reading all the other lines of the original template into memory and forming an input stream around the buffer. From the XML parser's point of view, the template has the transitional doctype INCLUDING the public/system ids.
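
The substitution described above can be sketched roughly as follows. This is not the code committed to XMLTokenStream.java; `DoctypeSubstitution` and its methods are hypothetical names, and the sketch assumes a UTF-8 template and normalizes line endings to `\n`, neither of which is stated in the comment.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: swaps an HTML5 doctype on the first line for the
// XHTML 1.0 Transitional doctype before the XML parser sees the template.
final class DoctypeSubstitution {

    static final String HTML5_DOCTYPE = "<!DOCTYPE html>";
    static final String TRANSITIONAL_DOCTYPE =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

    // Reads the whole template into memory, replacing the first line
    // only when it is exactly the HTML5 doctype.
    private static String rewrite(BufferedReader reader) throws IOException {
        StringBuilder buffer = new StringBuilder();
        String first = reader.readLine();
        if (first == null) {
            return "";
        }
        buffer.append(HTML5_DOCTYPE.equals(first) ? TRANSITIONAL_DOCTYPE : first).append('\n');
        String line;
        while ((line = reader.readLine()) != null) {
            buffer.append(line).append('\n');
        }
        return buffer.toString();
    }

    // Stream form: returns a new input stream built around the buffer.
    static InputStream substitute(InputStream template) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(template, StandardCharsets.UTF_8))) {
            return new ByteArrayInputStream(rewrite(reader).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // String form, convenient for trying the idea out.
    static String substitute(String template) {
        try {
            return rewrite(new BufferedReader(new StringReader(template)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the match is exact, a first line with leading whitespace, different casing, or a preceding XML declaration is passed through unchanged in this sketch.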

        Hudson added a comment -

        Integrated in tapestry-trunk-freestyle #534 (See https://builds.apache.org/job/tapestry-trunk-freestyle/534/)
        TAP5-840: Add Doctype component for greater control of rendered <!DOCTYPE>
        TAP5-840: Support character references in tml files with HTML 5 Doctype

        hlship : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1174466
        Files :

        • /tapestry/tapestry5/trunk/tapestry-core/src/main/java/org/apache/tapestry5/corelib/components/Doctype.java
        • /tapestry/tapestry5/trunk/tapestry-core/src/main/java/org/apache/tapestry5/dom/DTD.java
        • /tapestry/tapestry5/trunk/tapestry-core/src/main/java/org/apache/tapestry5/dom/Document.java
        • /tapestry/tapestry5/trunk/tapestry-core/src/main/java/org/apache/tapestry5/internal/structure/DTDPageElement.java
        • /tapestry/tapestry5/trunk/tapestry-core/src/test/groovy/org/apache/tapestry5/integration/app1/DoctypeTests.groovy
        • /tapestry/tapestry5/trunk/tapestry-core/src/test/java/org/apache/tapestry5/integration/app1/components/Border.java
        • /tapestry/tapestry5/trunk/tapestry-core/src/test/resources/org/apache/tapestry5/integration/app1/components/Border.tml

        hlship : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1174459
        Files :

        • /tapestry/tapestry5/trunk/tapestry-core/src/main/java/org/apache/tapestry5/internal/services/XMLTokenStream.java
        • /tapestry/tapestry5/trunk/tapestry-core/src/test/java/org/apache/tapestry5/internal/services/TemplateParserImplTest.java
        • /tapestry/tapestry5/trunk/tapestry-core/src/test/resources/org/apache/tapestry5/internal/services/html5_with_entities.tml
        Hudson added a comment -

        Integrated in tapestry-trunk-freestyle #550 (See https://builds.apache.org/job/tapestry-trunk-freestyle/550/)
        TAP5-840: Remove accidental import of org.apache.commons.io.output.ByteArrayOutputStream

        hlship : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1176623
        Files :

        • /tapestry/tapestry5/trunk/tapestry-core/src/main/java/org/apache/tapestry5/internal/services/XMLTokenStream.java

          People

          • Assignee:
            Howard M. Lewis Ship
            Reporter:
            Ben Gidley
          • Votes:
            10
            Watchers:
            8

            Dates

            • Created:
              Updated:
              Resolved:
