Xerces-C++ / XERCESC-1207

XMLScanner::scanCharData fills XMLBuffer until out of memory

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: None
    • Component/s: Non-Validating Parser
    • Labels: None

    Description

      When parsing an XML file consisting primarily of very large (hundreds of megabytes) blocks of contiguous character data, XMLScanner::scanCharData() happily attempts to build a single XMLBuffer containing all the data. Eventually the buffer becomes so large that the reallocation within XMLBuffer::insureCapacity() fails, causing std::bad_alloc to be thrown, or a crash in memcpy (depending on compiler). The fundamental problem seems to be that there is no upper bound imposed on buffer length.

      In the SAX model, it is acceptable to issue multiple ContentHandler::characters() callbacks for a single contiguous block of data. The only restriction on how this should be implemented is that all characters in any single event must come from the same external entity; no further behavior is specified. So it would be perfectly conformant to the SAX model to set an upper bound on the size of a single characters() event.
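      To illustrate why splitting is harmless for a well-behaved client, here is a minimal sketch of a handler that processes character data incrementally. `StreamingTextHandler` and `deliver` are hypothetical stand-ins written for this example; they are not the Xerces-C++ ContentHandler interface, and the chunking ignores details such as entity boundaries and surrogate pairs.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Simplified stand-in for a SAX content handler (hypothetical, not the
// Xerces-C++ interface). It processes text as it arrives and never holds
// the whole run in memory.
class StreamingTextHandler {
public:
    // May be called any number of times for one contiguous run of text.
    void characters(const char* /*chars*/, std::size_t length) {
        totalChars_ += length;
        ++events_;
    }
    std::size_t totalChars() const { return totalChars_; }
    std::size_t events() const { return events_; }

private:
    std::size_t totalChars_ = 0;
    std::size_t events_ = 0;
};

// Hypothetical driver: delivers `data` as events of at most `maxEvent`
// characters each. A limit of 0 is treated as 1 to guarantee progress.
inline void deliver(StreamingTextHandler& handler,
                    const std::string& data,
                    std::size_t maxEvent) {
    if (maxEvent == 0) maxEvent = 1;
    for (std::size_t pos = 0; pos < data.size(); pos += maxEvent) {
        const std::size_t n = std::min(maxEvent, data.size() - pos);
        handler.characters(data.data() + pos, n);
    }
}
```

      A handler written this way sees the same total amount of text whether the data arrives as one event or as many; only the number of events changes.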

      (As far as I understand, imposing an upper bound in XMLScanner::scanCharData() would not affect the DOM.)

      I propose adding an upper bound on character buffer size as an optional parameter (with a reasonable default value), either in the parser's constructor or in useScanner(). XMLScanner::scanCharData() would then use that parameter to decide when to force a call to sendCharData(), flushing the buffer to its client.
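      The proposal could look roughly like the following sketch. It is a standalone illustration, not the attached patch or the actual scanner code: `scanCharDataBounded`, `CharSink`, and `maxBufferChars` are hypothetical names, and the loop treats `<` as the only terminator, leaving out entity references, surrogate pairs, and whitespace handling.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical callback standing in for sendCharData().
using CharSink = std::function<void(const std::string&)>;

// Sketch of a bounded character-data scan: accumulate characters until
// markup starts, but flush to the sink whenever the buffer reaches
// `maxBufferChars`. Returns the number of flushes performed.
inline std::size_t scanCharDataBounded(const std::string& input,
                                       std::size_t maxBufferChars,
                                       const CharSink& sendCharData) {
    if (maxBufferChars == 0) maxBufferChars = 1;  // guarantee progress

    std::string buffer;
    buffer.reserve(std::min(maxBufferChars, input.size()));
    std::size_t flushes = 0;

    for (char ch : input) {
        if (ch == '<') break;  // markup ends the character-data run
        buffer.push_back(ch);
        if (buffer.size() >= maxBufferChars) {
            sendCharData(buffer);  // forced flush: buffer never exceeds the cap
            buffer.clear();
            ++flushes;
        }
    }
    if (!buffer.empty()) {  // deliver whatever is left at the end of the run
        sendCharData(buffer);
        ++flushes;
    }
    return flushes;
}
```

      With a cap in place, peak memory for character data is bounded by the cap rather than by the size of the input, at the cost of more callbacks for large text runs.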

      Attachments

        1. inputbuffersize
          14 kB
          Dan Rosen
        2. inputbuffersize
          13 kB
          Dan Rosen

          People

            Assignee: Unassigned
            Reporter: Dan Rosen