Maven Doxia / DOXIA-480

XhtmlBaseParser ignores XHTML default entities


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Component/s: Core, Module - Xhtml
    • Labels: None
    • Patch

    Description

      XHTML defines a number of default entities that can appear in valid XHTML files (http://www.w3.org/TR/xhtml1/#h-A2), such as the left and right quotation marks &ldquo; and &rsquo; (rendered as “ and ’), and many others.

      XhtmlBaseParser, however, ignores XHTML default entities appearing in the source. This is because it delegates parsing to AbstractXmlParser, which uses a vanilla MXParser. MXParser only recognises the default XML entities (&amp;, &lt;, &gt;, &apos;, &quot;).
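
      To illustrate the difference between default XML entities and XHTML entities, here is a minimal sketch using the JDK's built-in DOM parser as a stand-in. It is not MXParser and behaves differently: it rejects an undeclared entity outright instead of leaving it unresolved. The class and method names (EntityRecognitionDemo, parseText) are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; uses the JDK parser, not MXParser.
class EntityRecognitionDemo {

    /** Returns the root element's text, or null if the parser rejects the input. */
    static String parseText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getTextContent();
        } catch (SAXException e) {
            // Not well-formed, e.g. a reference to an undeclared entity.
            return null;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      With this sketch, parseText("<p>&quot;hi&quot;</p>") yields the quoted text, while parseText("<p>&ldquo;hi&rdquo;</p>") yields null because &ldquo; and &rdquo; are not declared anywhere the plain XML parser can see.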

      Because the HTML entities are not resolved by the XML parser, and thus not by the XHTML parser either, they are not rendered by the XHTML module. I have attached a sample Maven site project that uses the XHTML module. The source file contains double and single quotes, but the output file does not.

      This also affects other parsers that extend XhtmlParser, e.g. MarkdownParser (see DOXIA-473 for a reported bug). This is because the Pegdown library, used to parse Markdown, generates &ldquo; for quotes, as well as other entities.

      I have attached a patch that fixes this problem. It exposes the XmlPullParser (MXParser) for configuration before parsing, so that extending classes can define additional entities. XhtmlBaseParser then adds the default XHTML entities to the parser. This patch will also fix DOXIA-473, because MarkdownParser extends XhtmlParser.
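
      The approach described above can be sketched roughly as follows. This is not the attached patch: SketchPullParser, SketchXmlParser, SketchXhtmlParser and the initXmlParser hook are hypothetical stand-ins, and only defineEntityReplacementText is modelled on the XmlPullParser method of the same name. Dropping unknown entities is an assumption made to mimic the reported symptom.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a pull parser that accepts user-defined entities. */
class SketchPullParser {
    private final Map<String, String> entities = new HashMap<>();

    SketchPullParser() {
        // The five entities predefined by XML itself.
        entities.put("amp", "&");
        entities.put("lt", "<");
        entities.put("gt", ">");
        entities.put("apos", "'");
        entities.put("quot", "\"");
    }

    void defineEntityReplacementText(String name, String replacement) {
        entities.put(name, replacement);
    }

    /** Replaces &name; references; unknown entities are dropped (assumed behaviour). */
    String resolve(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int end = (c == '&') ? text.indexOf(';', i) : -1;
            if (end < 0) {
                out.append(c);
                i++;
            } else {
                String value = entities.get(text.substring(i + 1, end));
                if (value != null) {
                    out.append(value);
                }
                i = end + 1;
            }
        }
        return out.toString();
    }
}

/** Hypothetical base parser: exposes the pull parser before parsing starts. */
class SketchXmlParser {
    /** Hook for subclasses; does nothing by default. */
    protected void initXmlParser(SketchPullParser parser) {
    }

    String parse(String source) {
        SketchPullParser parser = new SketchPullParser();
        initXmlParser(parser);
        return parser.resolve(source);
    }
}

/** Hypothetical XHTML parser: registers a few XHTML entities via the hook. */
class SketchXhtmlParser extends SketchXmlParser {
    @Override
    protected void initXmlParser(SketchPullParser parser) {
        super.initXmlParser(parser);
        parser.defineEntityReplacementText("ldquo", "\u201C");
        parser.defineEntityReplacementText("rdquo", "\u201D");
        parser.defineEntityReplacementText("lsquo", "\u2018");
        parser.defineEntityReplacementText("rsquo", "\u2019");
        parser.defineEntityReplacementText("nbsp", "\u00A0");
    }
}
```

      Because the entities are registered in an overridable hook, any parser that extends the XHTML one would inherit them without further changes, which is why a fix of this shape would also cover a Markdown parser built on top of it.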

      Attachments

        1. doxia-core-XhtmlBaseParser.patch
          20 kB
          Andrius Velykis
        2. doxia-core-XhtmlBaseParser.patch
          18 kB
          Andrius Velykis
        3. doxia-xhtml-entities-bug.zip
          2 kB
          Andrius Velykis

            People

              Assignee: Olivier Lamy (olamy)
              Reporter: Andrius Velykis (andrius.velykis)
              Votes: 0
              Watchers: 3
