  1. Commons Digester
  2. DIGESTER-96

Multiple element body parts problem

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • 1.1
    • None
    • None
    • Operating System: All
      Platform: All

    • 3893

    Description

      Hi,

      I ran into the following problem when using Digester to parse my XML file.
      Suppose we have the following XML structure:

      <parent>
      firstPart
      <child>
      child1
      </child>
      secondPart
      <child>
      child2
      </child>
      lastPart
      </parent>

      The corresponding DTD structure would be:

      <!ELEMENT parent (#PCDATA | child)*>
      <!ELEMENT child (#PCDATA)>

      As far as I can see, the body() method of a Rule object is called only when
      the end tag of the matching element is encountered, which is why we can
      retrieve only the "lastPart" portion of the <parent> element body from the
      example above.
      Moreover, in this case the body() method would receive "child2lastPart"
      as its parameter.
      I think this is because Digester assumes that an element never has body
      content split around child elements, as in the example I constructed.
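
To make the behaviour easier to picture, here is a minimal sketch using plain JDK SAX. It is not Digester's actual code: SingleBufferDemo and bodiesAtEndElement are hypothetical names, and the sketch assumes one shared text buffer that is reset at every start tag and delivered only at the end tag. The exact string the real Digester passes to body() may differ from what this sketch produces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Commons Digester.
public class SingleBufferDemo {

    /** Parses the XML and returns "element=body" for every end tag seen. */
    public static List<String> bodiesAtEndElement(String xml) throws Exception {
        final List<String> bodies = new ArrayList<>();
        final StringBuilder buffer = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                // Assumption: text collected before a child's start tag is discarded.
                buffer.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                buffer.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                // The body is delivered only here, once per element.
                bodies.add(qName + "=" + buffer.toString().trim());
                buffer.setLength(0);
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return bodies;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<parent>firstPart<child>child1</child>"
                + "secondPart<child>child2</child>lastPart</parent>";
        // "firstPart" and "secondPart" never reach the parent's body.
        System.out.println(bodiesAtEndElement(xml));
    }
}
```

Under these assumptions the parent element only ever sees the text that follows its last child, which matches the "lastPart" symptom described above.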

      I propose a solution in which the body() method of a rule is called
      on each characters() event for the matching element,
      rather than on endElement(), so that all the body chunks can be handled.
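
A rough sketch of what per-chunk delivery could look like with plain JDK SAX follows. BodyChunkDemo, ChunkListener and bodyChunk are hypothetical names invented for illustration; Digester's Rule class has no such callback. One deliberate deviation from the proposal: because a SAX parser is allowed to split a single text run across several characters() calls, the sketch buffers text and flushes it at element boundaries instead of invoking the callback directly from characters().

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Commons Digester.
public class BodyChunkDemo {

    /** Hypothetical callback that receives each body chunk of an element. */
    public interface ChunkListener {
        void bodyChunk(String elementName, String text);
    }

    public static void parse(String xml, final ChunkListener listener) throws Exception {
        final Deque<String> open = new ArrayDeque<>();       // currently open elements
        final StringBuilder pending = new StringBuilder();   // text since the last tag

        DefaultHandler handler = new DefaultHandler() {
            // Hands the pending text to the innermost open element, if any.
            private void flush() {
                String text = pending.toString().trim();
                pending.setLength(0);
                if (!text.isEmpty() && !open.isEmpty()) {
                    listener.bodyChunk(open.peek(), text);
                }
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                flush();            // text before a child belongs to the parent
                open.push(qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                pending.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                flush();            // trailing text belongs to the closing element
                open.pop();
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
    }

    /** Convenience wrapper that records every chunk as "element=text". */
    public static List<String> collect(String xml) throws Exception {
        final List<String> chunks = new ArrayList<>();
        parse(xml, (name, text) -> chunks.add(name + "=" + text));
        return chunks;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<parent>firstPart<child>child1</child>"
                + "secondPart<child>child2</child>lastPart</parent>";
        System.out.println(collect(xml));
    }
}
```

With the example document, the listener would be called for firstPart, secondPart and lastPart on the parent, interleaved with child1 and child2 on the children, so no part of the mixed content is lost.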

      What do you think about this?

      Thanks!

      Best regards,
      Teodor Danciu

          People

            Unassigned Unassigned
            teodord@hotmail.com Teodor Danciu