AXIOM-406

Wrong and strange behavior in escaping special symbols

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      When I parse a node like this (test example):

      <node>&#60; &lt; &gt; &#62;</node>

      toString() returns "<node>&lt; &lt; > ></node>"
      getText() returns "< < > >"

      Why does getText() unescape the special symbols?
      Why does toString() unescape the greater-than symbols, yet leave &lt; unchanged and convert &#60; to &lt;?

      Is this correct?

        Activity

        Juan Miguel Cejuela created issue -
        Juan Miguel Cejuela made changes -
        Field: Summary
        Original Value: Wrong escaping of special symbols
        New Value: Wrong and strange behavior in escaping special symbols
        Juan Miguel Cejuela made changes -
        Field: Description
        Original Value:

        When I parse a node which is like (test example):

        <node>&#60; &lt; &gt; &#62;</node>

        toString() returns "<node>&lt; &lt; > ></node>"
        getText() returns "< < > >"

        Why getText unescapes the special symbols?
        Why toString() unescapes the greater-than symbols and yet leaves &lt; unchanged and &#60; gets converted to &lt; ?

        New Value: the same text, followed by the line:

        Is this correct?
        Andreas Veithen added a comment -

        The document consists of a single element that has a single child which is a text node with value "< < > >". That value is returned by getText(). When the object model for this document is serialized again (which is what toString() does), '<' needs to be replaced by an entity, but replacing '>' is optional. There is nothing wrong or strange here.
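
        To illustrate the first half of the comment above: a minimal sketch that parses the test document and reads back the text node's value. It uses the JDK's built-in DOM parser rather than Axiom (an assumption made so the example runs without extra dependencies), so getTextContent() stands in for Axiom's getText(). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; uses the JDK DOM API, not Axiom.
final class TextNodeValueDemo {

    // Parses the XML string and returns the text content of the root element.
    // The parser resolves character references (&#60;) and predefined
    // entities (&lt;) before the value reaches the object model.
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints: < < > >
        System.out.println(rootText("<node>&#60; &lt; &gt; &#62;</node>"));
    }
}
```

        Both &#60; and &lt; denote the same character, so the object model cannot tell afterwards which form the source document used.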

        Andreas Veithen made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Invalid [ 6 ]
        Juan Miguel Cejuela added a comment -

        Thanks for the quick reply. May I ask why in the serialization the replacement of ">" symbols is optional?

        Andreas Veithen added a comment -

        Because of the following production in the XML specs:

        CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
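
        The production above can be read as an escaping rule: character data may not contain '<' or '&', and may not contain the sequence "]]>", so a bare '>' is otherwise legal. The following is a minimal sketch of a serializer-side escaper built on that reading. It is not Axiom's actual implementation; the class and method names are hypothetical, and it ignores other concerns such as carriage returns or characters that are not allowed in XML at all.

```java
// Hypothetical helper; a sketch of the CharData rule, not Axiom's serializer.
final class CharDataEscaper {

    // Escapes text so that it matches the CharData production:
    // '<' and '&' are always replaced; '>' is replaced only when it
    // would complete the forbidden sequence "]]>".
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') {
                out.append("&lt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else if (c == '>' && i >= 2
                    && text.charAt(i - 1) == ']'
                    && text.charAt(i - 2) == ']') {
                out.append("&gt;");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: &lt; &lt; > >
        System.out.println(escape("< < > >"));
        // Prints: a ]]&gt; b
        System.out.println(escape("a ]]> b"));
    }
}
```

        Applied to the text node value "< < > >", this yields the same content that toString() produced in the report: both '<' characters are escaped and both '>' characters are left as they are.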

        Andreas Veithen made changes -
        Labels escape text unescape xml

          People

          • Assignee: Unassigned
          • Reporter: Juan Miguel Cejuela