Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5
    • Fix Version/s: None
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

      Description

      While testing our application we ran into a strange Digester parse issue.
      It looks like the Digester sometimes forgets to parse a value in the XML. Here
      is the situation:

      1. If we fire testscript A, which does not comply with the schema we set on the
      Digester, then we get a parsing error as expected. The error is that field Z in
      the XML was not valid.
      2. We fire testscript B, which should return an answer. The first time we fire
      it, the Digester does not map field Z (which now has a valid value) to the Java
      class as defined in the rule file.
      3. We fire testscript B again, unchanged, and now field Z is mapped by the
      Digester to the correct attribute in the corresponding Java class.

      If at point 1 we don't fire testscript A (with the invalid value for attribute Z)
      but, say, C or any other, this does not occur and we get the reply we expect.

      It seems that after a call which results in a SAXException due to an
      invalid value in the XML according to the attached schema, the next call fails
      to parse the XML correctly to the Java object defined in the rule file. The
      third call, however (which is exactly the same as the second), succeeds.

      Any ideas?

      Regards,
      Lars Vonk

        Activity

        lars added a comment -

        Adding clear() solved it!
        Thanks a lot for your help.

        Regards Lars Vonk

        Simon Kitching added a comment -

        The problems with reusing a Digester instance aren't related to multi-thread
        synchronization. The problem is that the Digester class has quite a few member
        variables whose values "evolve" as SAX events are received during a parse.

        When reusing the Digester instance, you therefore need to ensure that all those
        members are reset back to their initial states before the second parse begins.

        The "clear()" method makes a stab at resetting these, but it is actually rather
        a difficult problem. I personally wouldn't ever reuse a Digester instance in
        production code. However I know that some people do.

        If you are determined to reuse Digester instances, then at the least you should
        call the clear() method before each parse. This method is called automatically
        when the SAX event "endDocument" is received; however when a parse fails
        part-way through, this event is never generated so the clear() method is never
        invoked. This is probably the cause of the problem you are experiencing right now.

        Can you add a call to digester.clear() into your code and see if it resolves
        your problem?

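The mechanism described above can be sketched with the JDK's own SAX parser. This is a minimal illustration, not Digester's actual internals: `StaleStateDemo`, `ReusedHandler`, the `message/role` pattern and the two-character limit are all hypothetical stand-ins for the reporter's setup. The handler keeps an element path that evolves during the parse and is only cleared in `endDocument`, so a parse that fails part-way leaves it dirty.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class StaleStateDemo {

    /** Hypothetical stand-in for a reused, stateful parser front end. */
    public static class ReusedHandler extends DefaultHandler {
        // State that evolves as SAX events arrive.
        private final Deque<String> path = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();
        public String mappedRole;

        /** Resets the per-parse state, in the spirit of a clear() method. */
        public void clear() {
            path.clear();
            text.setLength(0);
        }

        @Override
        public void startDocument() {
            mappedRole = null;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.addLast(qName);
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (String.join("/", path).equals("message/role")) {
                // Simulates the schema violation: role may have at most 2 positions.
                if (text.length() > 2) {
                    throw new SAXException("role longer than 2 characters");
                }
                mappedRole = text.toString();
            }
            path.removeLast();
        }

        @Override
        public void endDocument() {
            // Only reached when the whole document was parsed.
            clear();
        }
    }

    /** Parses with the shared handler; returns the mapped role, or "ERROR" on a SAX failure. */
    public static String parse(ReusedHandler handler, String xml, boolean clearFirst) {
        if (clearFirst) {
            handler.clear();
        }
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.mappedRole;
        } catch (SAXException e) {
            return "ERROR";
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String bad = "<message><role>ABC</role></message>";
        String good = "<message><role>AB</role></message>";
        ReusedHandler handler = new ReusedHandler();
        System.out.println(parse(handler, bad, false));  // fails part-way, path stays dirty
        System.out.println(parse(handler, good, false)); // stale path, role is not mapped
        System.out.println(parse(handler, good, false)); // endDocument cleared the state
    }
}
```

Passing `true` as the third argument resets the state before every parse, which is the same idea as calling clear() on a reused Digester before each parse.
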
        lars added a comment -

        Thanks for the quick reply.
        I do indeed reuse the Digester instance for performance reasons, but the parse
        method is synchronized (do I understand from your explanation that this does
        not help?). During development, creating the Digester took a lot of processing
        time, so I reused it. Can you give an estimate, as a percentage, of how much
        time we gain if the rules are created as you described?

        Attached you will find the following files (I had to alter some names because
        of the privacy rules of the company I work for):

        • create.java - code snippet showing how I create the Digester. I made a sort
          of wrapper class around it to make sure parsing is synchronized.
        • rules.xml - rule file
        • Script-1.xml - testscript A as described in my first explanation
        • Script-2.xml - testscript B as described

        Unfortunately I cannot post the original XSD as an attachment here, but the
        only thing of importance for this example is that the XSD states that the
        OrganisationRole (field Z) has a maximum of 2 positions. That is the reason
        why it gives the SAXException as described in my first explanation.

        If you want the XSD anyway, send me an email at larsvonk@hotmail.com and I will
        send you a somewhat altered one.

        Thanks again!

        lars added a comment -

        Created an attachment (id=11991)
        Tests and code snippets

        Simon Kitching added a comment -

        It does indeed sound like a strange problem, and one that will be very difficult
        to diagnose unless you can provide code which can demonstrate the problem.

        Are you using the same Digester instance to parse multiple input files, or are
        you creating a new Digester instance each time? I would recommend against
        trying to reuse a digester object.

        If you are concerned about performance then you might want to create a RulesBase
        object, add rule instances to it, then reuse that object but create a separate
        Digester object each time. The reason is that the Digester object's main purpose
        is to maintain state during SAX event-based parsing. When this is rudely
        interrupted by an exception, the state may get a bit messy, preventing a clean
        reset before the next parse. The Rule and RulesBase classes, on the other hand,
        are not supposed to be modified by the parsing, i.e. are "stateless" with respect
        to the SAX parsing stage and so are pretty safe to reuse.

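The split between reusable rules and per-parse state can be sketched as follows. This is an analogy built on the JDK's SAX parser rather than on Digester itself: `SharedRulesDemo`, `Message`, `OneShotHandler` and the `message/role` pattern are hypothetical names. The rule table is built once and never modified during parsing, while every parse gets a fresh handler, so a failed parse cannot leak state into the next one. With Digester, the corresponding step would be populating a RulesBase once and handing it to each newly created Digester.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SharedRulesDemo {

    /** Hypothetical target bean; the field stands in for "field Z". */
    public static class Message {
        public String organisationRole;
    }

    // Built once, read-only afterwards: element path -> setter on the target.
    private static final Map<String, BiConsumer<Message, String>> RULES = buildRules();

    private static Map<String, BiConsumer<Message, String>> buildRules() {
        Map<String, BiConsumer<Message, String>> rules = new HashMap<>();
        rules.put("message/role", (m, v) -> m.organisationRole = v);
        return Collections.unmodifiableMap(rules);
    }

    /** Holds all mutable parse state; one instance per parse, never reused. */
    private static class OneShotHandler extends DefaultHandler {
        final Message target = new Message();
        private final Deque<String> path = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.addLast(qName);
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            BiConsumer<Message, String> rule = RULES.get(String.join("/", path));
            if (rule != null) {
                rule.accept(target, text.toString());
            }
            path.removeLast();
        }
    }

    /** Returns the populated Message, or null if the XML could not be parsed. */
    public static Message parse(String xml) {
        OneShotHandler handler = new OneShotHandler(); // fresh state every time
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.target;
        } catch (SAXException e) {
            return null;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Message failed = parse("<message><role>AB</message>");     // malformed, parse aborts
        Message ok = parse("<message><role>AB</role></message>");  // unaffected by the failure
        System.out.println(failed == null ? "parse failed" : failed.organisationRole);
        System.out.println(ok.organisationRole);
    }
}
```

The cost of this design is one small handler allocation per parse; the expensive part, building the rules, still happens only once.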

          People

          • Assignee:
            Unassigned
            Reporter:
            lars