Commons Digester
  1. Commons Digester
  2. DIGESTER-161

Document thread-safety in javadoc of Rule class

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Trivial Trivial
    • Resolution: Fixed
    • Affects Version/s: 3.1
    • Fix Version/s: None
    • Labels:

      Description

      I discovered a problem today with some code that was reusing a custom Rule in multiple threads, even though each thread was creating its own digester. It seems that Digester.addRule is calling rule.setDigester and if the rule is shared across multiple threads, the calls to begin/end can get tangled across threads.

      It is obvious that Rules are not meant to be shared, but the javadoc <http://commons.apache.org/digester/apidocs/org/apache/commons/digester3/Rule.html> seems to be implying the opposite and is confusing at best. It talks about the rules being stateless, even though the framework itself is changing its state with rule.setDigester(digester). It further states that since all state is part of the digester, the rule is safe under all cases, which is very misleading.

      " ... Rule objects should be stateless, ie they should not update any instance member during the parsing process. A rule instance that changes state will encounter problems if invoked in a "nested" manner; this can happen if the same instance is added to digester multiple times or if a wildcard pattern is used which can match both an element and a child of the same element. The digester object stack and named stacks should be used to store any state that a rule requires, making the rule class safe under all possible uses. ..."

      I think the statement above should be reworded to be more correct and avoid confusion. Down the line, maybe the digester accessed by the rule should be a ThreadLocal.

        Activity

        Hide
        Simone Tripodi added a comment -

        Nice to see someone deeply debugging the Digester

        I personally avoid that kind of use of the Rule instance because of non-thread safety-ness nature of the Digester instance, that is why in digester3 I thought adding the org.apache.commons.digester3.binder.RuleProvider in the binder API, to make sure that Rule instances are not shared across different Digester instances.

        A typical use case - maybe like yours is - you need custom inject dependencies in the custom Rule implementation, let's assume you are parsing an xml feed to ingest content inside a database:

        public class MyCustomRule
            extends Rule
        {
        
            private final Connection connection;
        
            public MyCustomRule( Connection connection )
            {
               this.connection = connection;
            }
        }
        

        the related provider implementation would be:

        public class MyCustomRuleProvider
            implements RuleProvider<MyCustomRule>
        {
        
            private final DataSource dataSource;
        
            public MyCustomRuleProvider( DataSource dataSource )
            {
                this.dataSource = dataSource;
            }
        
            public MyCustomRule get()
            {
                return new MyCustomRule( dataSource.getConnection() );
            }
        
        }
        

        then, in the RulesModule:

        public class MyRulesModule
            extends AbstractRulesModule
        {
        
            protected void configure()
            {
                forPattern( "my/rule/path" ).addRuleCreatedBy( new MyCustomRuleProvider( dataSourceReference ) );
            }
        
        }
        

        said that, we have two options - I am open to both:

        • as you suggested, updating the Rule class in oder to store digester and namespaceURI references in ThreadLocals

        or

        • updating the documentation according to current behavior - since I am not a native English speaker, I would like you to provide a patch to apply, please

        Thanks for participating!
        -Simo

        Show
        Simone Tripodi added a comment - Nice to see someone deeply debugging the Digester I personally avoid that kind of use of the Rule instance because of non-thread safety-ness nature of the Digester instance, that is why in digester3 I thought adding the org.apache.commons.digester3.binder.RuleProvider in the binder API, to make sure that Rule instances are not shared across different Digester instances. A typical use case - maybe like yours is - you need custom inject dependencies in the custom Rule implementation, let's assume you are parsing an xml feed to ingest content inside a database: public class MyCustomRule extends Rule { private final Connection connection; public MyCustomRule( Connection connection ) { this .connection = connection; } } the related provider implementation would be: public class MyCustomRuleProvider implements RuleProvider<MyCustomRule> { private final DataSource dataSource; public MyCustomRuleProvider( DataSource dataSource ) { this .dataSource = dataSource; } public MyCustomRule get() { return new MyCustomRule( dataSource.getConnection() ); } } then, in the RulesModule : public class MyRulesModule extends AbstractRulesModule { protected void configure() { forPattern( "my/rule/path" ).addRuleCreatedBy( new MyCustomRuleProvider( dataSourceReference ) ); } } said that, we have two options - I am open to both: as you suggested, updating the Rule class in oder to store digester and namespaceURI references in ThreadLocals or updating the documentation according to current behavior - since I am not a native English speaker, I would like you to provide a patch to apply, please Thanks for participating! -Simo
        Hide
        Eduard Papa added a comment -

        The code I was referring to was actually using digester 2.x, so I don't think the provider method is there. It was just creating digester and adding all the rules. The rule that was causing the problem was a static final...hence the thread-safety issue.

        I would have liked someone who knows more about Digester to update the javadoc but I'll give a try....tomorrow hopefully.

        Show
        Eduard Papa added a comment - The code I was referring to was actually using digester 2.x, so I don't think the provider method is there. It was just creating digester and adding all the rules. The rule that was causing the problem was a static final...hence the thread-safety issue. I would have liked someone who knows more about Digester to update the javadoc but I'll give a try....tomorrow hopefully.
        Hide
        Eduard Papa added a comment - - edited

        I have attached an improved javadoc for the Rule class. Please review it and commit the change.

        Show
        Eduard Papa added a comment - - edited I have attached an improved javadoc for the Rule class. Please review it and commit the change.
        Hide
        Eduard Papa added a comment -

        Can someone commit the new javadoc I attached, assign it to a version, and mark it as resolved. Thanks

        Show
        Eduard Papa added a comment - Can someone commit the new javadoc I attached, assign it to a version, and mark it as resolved. Thanks
        Hide
        Simone Tripodi added a comment -

        fixed on r1239129.

        For future releases, please read the On Contributing Patches short guide.

        Thanks for contributing

        Show
        Simone Tripodi added a comment - fixed on r1239129 . For future releases, please read the On Contributing Patches short guide. Thanks for contributing

          People

          • Assignee:
            Simone Tripodi
            Reporter:
            Eduard Papa
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 1h
              1h
              Remaining:
              Remaining Estimate - 1h
              1h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development