Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 3.0.0
    • Labels:
      None
    • Environment:
      Operating System: Windows NT/2K
      Platform: PC

      Description

      Parser crashes in ContentSpecNode.hpp: ContentSpecNode::~ContentSpecNode().

      Steps to reproduce:
      Validate an XML file against a schema with an element having maxOccurs >=
      200000.

      Assumed cause:
      Stack overflow

      Makeshift resolution:
      Set the repeat count to unbounded (-1) when maxOccurs > 500:

      inline void ContentSpecNode::setMaxOccurs(int max)
      {
          // Workaround: treat any maxOccurs above 500 as unbounded (-1).
          if (max > 500)
              max = -1;
          fMaxOccurs = max;
      }
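
      The assumed cause can be illustrated with a minimal sketch. It assumes the crash comes from a destructor that recursively deletes a very deep chain of child nodes, which the report suggests but does not confirm. `Node`, `buildChain` and `destroyIteratively` are hypothetical names for illustration and are not part of Xerces-C++. The sketch frees such a chain with an explicit heap-allocated work list, so the call stack depth stays constant regardless of chain length:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a binary content-spec node; NOT the Xerces-C++ class.
// It deliberately has no destructor, so deleting a node never recurses.
struct Node {
    Node* first = nullptr;
    Node* second = nullptr;
};

// Builds a left-leaning chain of `depth` nodes, the shape that results from
// repeatedly wrapping the previous node in a new parent.
Node* buildChain(std::size_t depth) {
    Node* head = nullptr;
    for (std::size_t i = 0; i < depth; ++i) {
        Node* parent = new Node;
        parent->first = head;
        head = parent;
    }
    return head;
}

// Frees a whole tree using an explicit work list instead of recursion.
// Returns the number of nodes freed.
std::size_t destroyIteratively(Node* root) {
    std::size_t freed = 0;
    std::vector<Node*> pending;
    if (root != nullptr)
        pending.push_back(root);
    while (!pending.empty()) {
        Node* current = pending.back();
        pending.pop_back();
        assert(current != nullptr);  // only non-null nodes are ever queued
        if (current->first != nullptr)
            pending.push_back(current->first);
        if (current->second != nullptr)
            pending.push_back(current->second);
        delete current;
        ++freed;
    }
    return freed;
}
```

      With this approach, `destroyIteratively(buildChain(200000))` releases all nodes using heap memory for the work list, whereas a destructor that deletes its children recursively would need one stack frame per level.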

        Activity

        Christian Charbula added a comment -

        Well, that's really a big problem for me.

        Since I have to implement a solution that was specified by the UNO, about 30 governments and several hundred credit institutions, it's really not possible for me to change the maxOccurs.
        The maxOccurs is a very important limitation in this validation (500 is too low by far).
        It's not possible to set/overrule the maxOccurs to another value.

        I need to use the Xerces-C++ parser, since it's also available for some 'exotic' operating systems (like OpenVMS).

        Please give this issue a higher priority.
        Why is recursion needed? Can't it be implemented as a counting solution?
        Thanks,
        Christian

        Frank Rast added a comment -

        I would also like to see a higher priority.

        Can someone estimate when this issue will be solved?
        I would prefer a counting solution too, but anything else is also better than a crash.

        Any thoughts?

        Regards,
        Frank

        David Bertoni added a comment -

        Well, since this is an open source project, either of you could contribute. Perhaps you could investigate and work on it?

        Igor Sakhnov added a comment -

        Changing maxOccurs does not help; try the following schema and data:
        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
          <xs:element name="season">
            <xs:complexType>
              <xs:sequence minOccurs="100" maxOccurs="300">
                <xs:element name="spring" minOccurs="10" maxOccurs="99"/>
                <xs:element name="summer" minOccurs="10" maxOccurs="99"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:schema>

        <season>
          <spring>cool</spring>
          <summer>warm</summer>
        </season>

        Any suggestions on what can be tweaked? And yes, our developers submitted various fixes for different items, and as far as I know there was no feedback.

        Karsten Strunk added a comment - - edited

        I think I'm encountering a similar problem with Xerces 2.7.0. Xerces crashes when maxOccurs is too large.

        Xerces always allocates memory for elements according to the number specified in maxOccurs, regardless of the real number of elements in the XML instance. If the real number of elements is small, most allocated elements are not needed.

        For example:

        <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
          <xs:element name="computer">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="serialNumber" maxOccurs="10000"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:schema>

        <computer>
          <serialNumber>1</serialNumber>
        </computer>

        In this case 10000 elements are created, but only 1 is really needed. I found the following code fragment in ComplexTypeInfo::expandContentModel:

        for (int j = 1; j < maxOccurs - minOccurs; j++)
        {
            retNode = new (fMemoryManager) ContentSpecNode(
                ContentSpecNode::Sequence, retNode, optional,
                true, false, fMemoryManager);
        }

        In my case I have an XML schema which uses maxOccurs="9999999". So each time this element is found in an XML instance, 9999999 elements are created, although normally just a few are really needed. So I'd also prefer a counting version which creates only the elements that are really needed.
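
        A counting version of the kind requested here could look roughly like the sketch below. It is only an illustration under stated assumptions: `OccurrenceRange` and `OccurrenceCounter` are hypothetical names, not Xerces-C++ types, and the sketch covers a single repeated element rather than a full content model. The point is that memory use is independent of maxOccurs, because only a counter is stored:

```cpp
#include <cassert>

// Hypothetical names for illustration; these are not Xerces-C++ types.
struct OccurrenceRange {
    int minOccurs;
    int maxOccurs;  // -1 means unbounded
};

// Tracks how often one element has occurred, instead of expanding the
// content model into one node per allowed occurrence.
class OccurrenceCounter {
public:
    explicit OccurrenceCounter(OccurrenceRange range)
        : range_(range), count_(0) {
        assert(range.minOccurs >= 0);
        assert(range.maxOccurs == -1 || range.maxOccurs >= range.minOccurs);
    }

    // Records one more occurrence; returns false if that would exceed
    // maxOccurs, in which case the count is left unchanged.
    bool accept() {
        if (range_.maxOccurs != -1 && count_ >= range_.maxOccurs)
            return false;
        ++count_;
        return true;
    }

    // True once enough occurrences have been seen to satisfy minOccurs.
    bool satisfied() const { return count_ >= range_.minOccurs; }

    int count() const { return count_; }

private:
    OccurrenceRange range_;
    int count_;
};
```

        For the serialNumber example, a counter built from `OccurrenceRange{1, 9999999}` would accept the single element and report the range as satisfied, without allocating anything proportional to 9999999.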

        Boris Kolpackov added a comment -

        Tentatively changing the "fix for" version to 3.0.0. Alberto has a solution which for now does not work for all cases. He will try to get it into shape and I will try to help. If this fails then we will probably need to implement the "treat large maxOccurs as unbounded" hack for the time being.

        Michael Glavassevich added a comment -

        You may want to take a look at what Peter McCracken and I did for Xerces-J. See XERCESJ-773 and XERCESJ-1267.

        Alberto Massari added a comment -

        The fix for Xerces-J has been backported to Xerces-C.

        Frank Rast added a comment -

        Now it does not crash, but the performance could be better. Large maxOccurs values still slow down validation. Is there a way to get better performance in this case?


          People

          • Assignee:
            Alberto Massari
            Reporter:
            Frank Rast
          • Votes:
            2
            Watchers:
            3
