Commons Digester
  1. Commons Digester
  2. DIGESTER-77

[digester] Support line-precise error reporting

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

      Description

      This patch adds a very minimal and optional SAX pipeline to Digester, allowing
      the user to inject XML transformations between the parser and Digester
      processing. The original purpose is to make it possible to create a
      LocationTransformer that injects the source, line number, and column number
      pieces of information into every element through namespaced attributes, allowing
      user to have digester create their own Location objects and use the information
      for enhanced debugging.

      Incidently, this transforming pipeline hook opens digester up for more
      interesting features like supporting multiple xml formats simultaneously by
      allowing the developer to create special transformers that trigger when an old
      format is used, transforming into the latest one.

        Activity

        Hide
        Don Brown added a comment -

        Created an attachment (id=18156)
        Patch that adds the pipeline hook, location transformer, and tests

        Show
        Don Brown added a comment - Created an attachment (id=18156) Patch that adds the pipeline hook, location transformer, and tests
        Hide
        Simon Kitching added a comment -

        Hi, I've read the posted patch and have got a few questions.

        (1)
        What does this transformer framework do that cannot be done by existing standard
        java features?

        The XMLConsumer interface is literally identical to the sax ContentHandler
        interface. It is already a well-known pattern to chain ContentHandlers together
        to form transformational pipelines (and Digester already implements
        org.xml.sax.ContentHandler because it extends DefaultHandler). So what does
        introducing the new XMLProducer/XMLConsumer interfaces give over the existing
        approach like this?

        Digester d = new Digester();
        MyTransformingContentHandler h = new MyTransformingContentHandler(d);
        xmlReader.setContentHandler(h);
        xmlReader.parse(...);

        (2)
        You mention transformers that "trigger when an old format is used". How would
        this work?

        (3)
        Minor note: the AbstractTransformer methods all check for (contentHandler !=
        null). Can't that be assumed (ie declared a precondition)? I can't see any
        justification for supporting an AbstractTransformer with no destination for its
        transformed events...

        (4)
        To get back to the original "line information" use case, it seems to me that
        it's already very simple for a user to get this info. They just write their own
        Rule class:

        public class MyLocationRule extends Rule {
        public void begin(...)

        { Locator loc = digester.getDocumentLocator(); Object target = digester.peek(); MyLocatableObject locatable = (MyLocatableObject) target; // store info (could convert to some app-specific representation // if desired) locatable.setLine(loc.getLineNumber()); locatable.setColumn(loc.getColumnNumber()); }

        }

        This allows a user to map location info onto their objects in any way they like,
        without any digester or xml classes needed on their target user objects. How is
        the current proposal an improvement on this?

        Please note I'm not trying to reject the proposed code; I just want to see some
        debate on benefits before increasing the size of the digester codebase...

        Regards,

        Simon

        Show
        Simon Kitching added a comment - Hi, I've read the posted patch and have got a few questions. (1) What does this transformer framework do that cannot be done by existing standard java features? The XMLConsumer interface is literally identical to the sax ContentHandler interface. It is already a well-known pattern to chain ContentHandlers together to form transformational pipelines (and Digester already implements org.xml.sax.ContentHandler because it extends DefaultHandler). So what does introducing the new XMLProducer/XMLConsumer interfaces give over the existing approach like this? Digester d = new Digester(); MyTransformingContentHandler h = new MyTransformingContentHandler(d); xmlReader.setContentHandler(h); xmlReader.parse(...); (2) You mention transformers that "trigger when an old format is used". How would this work? (3) Minor note: the AbstractTransformer methods all check for (contentHandler != null). Can't that be assumed (ie declared a precondition)? I can't see any justification for supporting an AbstractTransformer with no destination for its transformed events... (4) To get back to the original "line information" use case, it seems to me that it's already very simple for a user to get this info. They just write their own Rule class: public class MyLocationRule extends Rule { public void begin(...) { Locator loc = digester.getDocumentLocator(); Object target = digester.peek(); MyLocatableObject locatable = (MyLocatableObject) target; // store info (could convert to some app-specific representation // if desired) locatable.setLine(loc.getLineNumber()); locatable.setColumn(loc.getColumnNumber()); } } This allows a user to map location info onto their objects in any way they like, without any digester or xml classes needed on their target user objects. How is the current proposal an improvement on this? Please note I'm not trying to reject the proposed code; I just want to see some debate on benefits before increasing the size of the digester codebase... Regards, Simon
        Hide
        Don Brown added a comment -

        (In reply to comment #2)
        > Hi, I've read the posted patch and have got a few questions.
        >
        > (1)
        > What does this transformer framework do that cannot be done by existing standard
        > java features?

        Well, without the Generator and Serializer interfaces, I can see how Transformer
        doesn't add much to to XMLFilter. If the Transformer also combined the
        LexicalHandler, it is nice because XMLFilter as no setLexicalHandler, and also,
        XMLFilter brings a lot of baggage a fresh filter would have to implement. Still
        I could see this minor advantages outweighed by the need to keep Digester
        classes to a minimum.

        Also, you could argue Digester would be improved by splitting the handler code
        out of the Digester class, making it more pluggable and easier to read. I'd
        guess 90% of people just use its parse() methods. Furthermore, some of its
        processing could further be split into different Transformers increasing the
        flexibility. Of course this is a big change and my guess a not very likely one

        Finally, even if we used XMLFilter, I'd still argue the pipeline building code
        should be in the Digester. It is much easier to call addTransformer or
        addXMLFilter than to wire it up yourself.

        > (2)
        > You mention transformers that "trigger when an old format is used". How would
        > this work?

        Say you had a versioned configuration file, with a version attribute in the root
        element. In your new upgrade, you changed that configuration file schema and
        modified all your rules accordingly. With a well-designed pipeline, you could
        create a "switch" transformer that keyed off that version attribute. It read
        the value and dynamically inserted one or more transformations to handle the old
        format. Handling could mean automatic renaming of changed elements, adding in
        new required attributes, or outputting warnings of features not supported
        anymore. The end result is XML that is in the most current schema and can be
        read easily by your application. Before digester gets the XML though, perhaps
        you add another Transformer that validates the new XML against a DTD or some
        other schema validator.

        Ohter ideas for transformers including logging, performance timing, automatic
        backup, multi-casting the data like to class that digitally signs the XML to
        another file for security purposes, etc.

        > (3)
        > Minor note: the AbstractTransformer methods all check for (contentHandler !=
        > null). Can't that be assumed (ie declared a precondition)? I can't see any
        > justification for supporting an AbstractTransformer with no destination for its
        > transformed events...

        You could conceive of a transformer that doesn't send events along. For
        example, Digester has this built-in with the special ContentHandler setter which
        will swallow events if found. This is an example of a function that could be
        split off into its own Transformer.

        > (4)
        > To get back to the original "line information" use case, it seems to me that
        > it's already very simple for a user to get this info. They just write their own
        > Rule class:
        >
        > public class MyLocationRule extends Rule {
        > public void begin(...)

        { > Locator loc = digester.getDocumentLocator(); > Object target = digester.peek(); > MyLocatableObject locatable = (MyLocatableObject) target; > // store info (could convert to some app-specific representation > // if desired) > locatable.setLine(loc.getLineNumber()); > locatable.setColumn(loc.getColumnNumber()); > }

        > }
        >
        > This allows a user to map location info onto their objects in any way they like,
        > without any digester or xml classes needed on their target user objects. How is
        > the current proposal an improvement on this?

        This is certainly one way to do it, however speaking as a Digester newbie, I'd
        rather just use the existing rules than write custom code. Perhaps an abstract
        rule that minimizes the work and Digester knowledge required to deal with
        Locations would be easy enough. The pipeline idea is larger than this one
        feature, however.

        Again, thanks for the good feedback, and look forward to yours.

        Show
        Don Brown added a comment - (In reply to comment #2) > Hi, I've read the posted patch and have got a few questions. > > (1) > What does this transformer framework do that cannot be done by existing standard > java features? Well, without the Generator and Serializer interfaces, I can see how Transformer doesn't add much to to XMLFilter. If the Transformer also combined the LexicalHandler, it is nice because XMLFilter as no setLexicalHandler, and also, XMLFilter brings a lot of baggage a fresh filter would have to implement. Still I could see this minor advantages outweighed by the need to keep Digester classes to a minimum. Also, you could argue Digester would be improved by splitting the handler code out of the Digester class, making it more pluggable and easier to read. I'd guess 90% of people just use its parse() methods. Furthermore, some of its processing could further be split into different Transformers increasing the flexibility. Of course this is a big change and my guess a not very likely one Finally, even if we used XMLFilter, I'd still argue the pipeline building code should be in the Digester. It is much easier to call addTransformer or addXMLFilter than to wire it up yourself. > (2) > You mention transformers that "trigger when an old format is used". How would > this work? Say you had a versioned configuration file, with a version attribute in the root element. In your new upgrade, you changed that configuration file schema and modified all your rules accordingly. With a well-designed pipeline, you could create a "switch" transformer that keyed off that version attribute. It read the value and dynamically inserted one or more transformations to handle the old format. Handling could mean automatic renaming of changed elements, adding in new required attributes, or outputting warnings of features not supported anymore. The end result is XML that is in the most current schema and can be read easily by your application. Before digester gets the XML though, perhaps you add another Transformer that validates the new XML against a DTD or some other schema validator. Ohter ideas for transformers including logging, performance timing, automatic backup, multi-casting the data like to class that digitally signs the XML to another file for security purposes, etc. > (3) > Minor note: the AbstractTransformer methods all check for (contentHandler != > null). Can't that be assumed (ie declared a precondition)? I can't see any > justification for supporting an AbstractTransformer with no destination for its > transformed events... You could conceive of a transformer that doesn't send events along. For example, Digester has this built-in with the special ContentHandler setter which will swallow events if found. This is an example of a function that could be split off into its own Transformer. > (4) > To get back to the original "line information" use case, it seems to me that > it's already very simple for a user to get this info. They just write their own > Rule class: > > public class MyLocationRule extends Rule { > public void begin(...) { > Locator loc = digester.getDocumentLocator(); > Object target = digester.peek(); > MyLocatableObject locatable = (MyLocatableObject) target; > // store info (could convert to some app-specific representation > // if desired) > locatable.setLine(loc.getLineNumber()); > locatable.setColumn(loc.getColumnNumber()); > } > } > > This allows a user to map location info onto their objects in any way they like, > without any digester or xml classes needed on their target user objects. How is > the current proposal an improvement on this? This is certainly one way to do it, however speaking as a Digester newbie, I'd rather just use the existing rules than write custom code. Perhaps an abstract rule that minimizes the work and Digester knowledge required to deal with Locations would be easy enough. The pipeline idea is larger than this one feature, however. Again, thanks for the good feedback, and look forward to yours.
        Hide
        Don Brown added a comment -

        I should add I'm happy to see the line-precise feature can be done with today's
        digester. The gist of this patch then becomes let's make it easier and more
        obvious.

        Show
        Don Brown added a comment - I should add I'm happy to see the line-precise feature can be done with today's digester. The gist of this patch then becomes let's make it easier and more obvious.
        Hide
        Simon Kitching added a comment -

        (a) – new proposal

        The "custom rule" proposal I offered does have one drawback; it's necessary to
        add this custom rule at every place that a "locatable" object is created.
        Feasable, but clumsy.

        An alternative might be to add a hook to digester to get callbacks whenever an
        object is pushed onto the digester object stack. That should ensure that the
        callback is invoked whenever ObjectCreateRule, FactoryCreateRule, or any similar
        user-created rule fires. That code could then record location info (among other
        things). This would be a very simple change:

        Add interface:
        public interface StackAction

        { public void pushed(Digester d, String stackName, Object o); public void popped(Digester d, String stackName, Object o); }

        then add a StackAction member to digester, and update all the Digester push and
        pop methods to call the object if it exists.

        One possible use would be for location info; a user could write:
        public class SaveLocationAction {
        public void pushed(Digester d, String stackName, Object o) {
        if (o instanceof LocationAware)

        { LocationAware locObj = (LocationAware) o; Location l = d.getDocumentLocator(); o.setLine(l.getLine()); o.setColumn(l.getColumn()); }

        }
        }
        digester.setStackAction(new SaveLocationAction());

        (b) – feedback

        Regarding the generic transformer framework, I'm afraid I'm not convinced that
        there will be any great use of this if added. Unless someone comes along with a
        real use-case for this I would prefer not to add this to Digester; the line-info
        usecase isn't a convincing one to me as this can be solved in other (simpler)
        ways. As noted, transformations can already be done using chained ContentHandler
        objects if needed; I don't see any great reduction in complexity by using the
        custom XMLProducer/XMLConsumer interfaces.

        It's a shame that so many people are reluctant to write Digester rules. I don't
        see this as being any more unusual than writing an implementation of
        ContentHandler, or HttpServlet. I guess we need more documentation on this...

        I certainly agree that Digester's "processing" code should be split from its
        'setup' code. The digester2 prototype I've been (periodically) working on has
        this as one of its design goals. But (as noted) this isn't going to happen in
        the Digester 1.x series for backwards-compatibility reasons.

        Show
        Simon Kitching added a comment - (a) – new proposal The "custom rule" proposal I offered does have one drawback; it's necessary to add this custom rule at every place that a "locatable" object is created. Feasable, but clumsy. An alternative might be to add a hook to digester to get callbacks whenever an object is pushed onto the digester object stack. That should ensure that the callback is invoked whenever ObjectCreateRule, FactoryCreateRule, or any similar user-created rule fires. That code could then record location info (among other things). This would be a very simple change: Add interface: public interface StackAction { public void pushed(Digester d, String stackName, Object o); public void popped(Digester d, String stackName, Object o); } then add a StackAction member to digester, and update all the Digester push and pop methods to call the object if it exists. One possible use would be for location info; a user could write: public class SaveLocationAction { public void pushed(Digester d, String stackName, Object o) { if (o instanceof LocationAware) { LocationAware locObj = (LocationAware) o; Location l = d.getDocumentLocator(); o.setLine(l.getLine()); o.setColumn(l.getColumn()); } } } digester.setStackAction(new SaveLocationAction()); (b) – feedback Regarding the generic transformer framework, I'm afraid I'm not convinced that there will be any great use of this if added. Unless someone comes along with a real use-case for this I would prefer not to add this to Digester; the line-info usecase isn't a convincing one to me as this can be solved in other (simpler) ways. As noted, transformations can already be done using chained ContentHandler objects if needed; I don't see any great reduction in complexity by using the custom XMLProducer/XMLConsumer interfaces. It's a shame that so many people are reluctant to write Digester rules. I don't see this as being any more unusual than writing an implementation of ContentHandler, or HttpServlet. I guess we need more documentation on this... I certainly agree that Digester's "processing" code should be split from its 'setup' code. The digester2 prototype I've been (periodically) working on has this as one of its design goals. But (as noted) this isn't going to happen in the Digester 1.x series for backwards-compatibility reasons.
        Hide
        Simon Kitching added a comment -

        As there's been no negative comment on my proposal for a "StackAction" class for
        digester (actually, no comment at all), I've added it. The svn commit number is
        r398306.

        In particular, see the new LocationTrackerTestCase class which demonstrates how
        to use this to track xml location for created objects.

        As always, comment is welcome.

        I've also added javadoc to the Rule class to encourage people to write custom
        Rule subclasses; the rules provided by Digester were never expected to cover all
        possible situations.

        Regards,

        Simon

        Show
        Simon Kitching added a comment - As there's been no negative comment on my proposal for a "StackAction" class for digester (actually, no comment at all), I've added it. The svn commit number is r398306. In particular, see the new LocationTrackerTestCase class which demonstrates how to use this to track xml location for created objects. As always, comment is welcome. I've also added javadoc to the Rule class to encourage people to write custom Rule subclasses; the rules provided by Digester were never expected to cover all possible situations. Regards, Simon
        Hide
        Don Brown added a comment -

        Whether or not you use the XMLProducer/XMLConsumer interfaces, if you are
        wanting to separate digester setup from processing (i.e. Digester doesn't extend
        DefaultHandler), you will need to build in pipeline support, and IMO, you might
        as well start now. The change to Digester was very small, and saves a good
        amount of coding on the client side. In my view, rules and pipelines are
        orthogonal - you use pipelines to layer in changes to the XML, while rules do
        the final XML-to-object mapping.

        Anyways, the StackAction interface sounds useful, so I might give it a shot in
        Struts. Thanks.

        Show
        Don Brown added a comment - Whether or not you use the XMLProducer/XMLConsumer interfaces, if you are wanting to separate digester setup from processing (i.e. Digester doesn't extend DefaultHandler), you will need to build in pipeline support, and IMO, you might as well start now. The change to Digester was very small, and saves a good amount of coding on the client side. In my view, rules and pipelines are orthogonal - you use pipelines to layer in changes to the XML, while rules do the final XML-to-object mapping. Anyways, the StackAction interface sounds useful, so I might give it a shot in Struts. Thanks.

          People

          • Assignee:
            Unassigned
            Reporter:
            Don Brown
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development