Xerces-C++ / XERCESC-682

Performance problem with large text nodes and XMLFormatter.cpp

Details

    • Type: Bug
    • Status: Closed
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: Non-Validating Parser
    • Labels: None
    • Environment: Operating System: Windows NT/2K
      Platform: PC
    • Bugzilla Id: 13695

    Description

      I found a performance problem with large text nodes in
      XMLFormatter.cpp::formatBuff(). My node is about 6MB of base64-encoded
      binary data. The code scans the buffer for characters that need escaping and
      finds none (since it is base64 data). It then enters an if statement that is
      supposed to pass all the data it just checked through the transcoder. The
      problem is that it transcodes only one buffer's worth (about 16K), then loops
      around and starts over, scanning the remaining 6MB - 16K for escape
      characters again. I added a while loop inside the if statement, and
      performance improved by easily an order of magnitude. Here is the patch:

      --- XMLFormatter.cpp.old 2002-10-16 10:47:38.000000000 -0400
      +++ XMLFormatter.cpp 2002-10-16 10:47:49.000000000 -0400
      @@ -358,38 +358,42 @@
               //
               if (tmpPtr > srcPtr)
               {
      -            const unsigned int srcCount = tmpPtr - srcPtr;
      -            const unsigned srcChars = srcCount > kTmpBufSize ?
      -                                      kTmpBufSize : srcCount;
      +
      +            while ( tmpPtr > srcPtr )
      +            {
      +                const unsigned int srcCount = tmpPtr - srcPtr;
      +                const unsigned srcChars = srcCount > kTmpBufSize ?
      +                                          kTmpBufSize : srcCount;

      -            const unsigned int outBytes = fXCoder->transcodeTo
      -            (
      -                srcPtr
      -                , srcChars
      -                , fTmpBuf
      -                , kTmpBufSize
      -                , charsEaten
      -                , unRepOpts
      -            );
      +                const unsigned int outBytes = fXCoder->transcodeTo
      +                (
      +                    srcPtr
      +                    , srcChars
      +                    , fTmpBuf
      +                    , kTmpBufSize
      +                    , charsEaten
      +                    , unRepOpts
      +                );

      -            #if defined(XML_DEBUG)
      -            if ((outBytes > kTmpBufSize)
      -            ||  (charsEaten > srcCount))
      -            {
      -                // <TBD> The transcoder is freakin out maaaannn
      -            }
      -            #endif
      +                #if defined(XML_DEBUG)
      +                if ((outBytes > kTmpBufSize)
      +                ||  (charsEaten > srcCount))
      +                {
      +                    // <TBD> The transcoder is freakin out maaaannn
      +                }
      +                #endif

      -            // If we get any bytes out, then write them
      -            if (outBytes)
      -            {
      -                fTmpBuf[outBytes] = 0; fTmpBuf[outBytes + 1] = 0;
      -                fTmpBuf[outBytes + 2] = 0; fTmpBuf[outBytes + 3] = 0;
      -                fTarget->writeChars(fTmpBuf, outBytes, this);
      -            }
      +                // If we get any bytes out, then write them
      +                if (outBytes)
      +                {
      +                    fTmpBuf[outBytes] = 0; fTmpBuf[outBytes + 1] = 0;
      +                    fTmpBuf[outBytes + 2] = 0; fTmpBuf[outBytes + 3] = 0;
      +                    fTarget->writeChars(fTmpBuf, outBytes, this);
      +                }

      -            // And bump up our pointer
      -            srcPtr += charsEaten;
      +                // And bump up our pointer
      +                srcPtr += charsEaten;
      +            }
               }
               else if (tmpPtr < endPtr)
               {
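
      The effect of the patch can be illustrated outside Xerces with a small
      stand-alone sketch. It is not the library's code: the names kChunk,
      needsEscape, Stats, flushOneChunkPerScan and flushWholeRunPerScan are
      hypothetical, no transcoder is involved, and the chunk size is assumed to
      be exactly 16K characters (the report only says "about 16K"). The sketch
      just counts how many characters each strategy examines while looking for
      escapes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical stand-in for kTmpBufSize (assumed to be 16K characters).
constexpr std::size_t kChunk = 16 * 1024;

// Hypothetical predicate for characters that would need escaping.
inline bool needsEscape(char c)
{
    return c == '<' || c == '>' || c == '&' || c == '"';
}

struct Stats
{
    std::size_t scanned = 0;  // characters examined while looking for escapes
    std::size_t written = 0;  // characters handed to the output
};

// Behaviour described in the report: scan the whole clean run,
// flush a single chunk, then go back and scan the rest again.
Stats flushOneChunkPerScan(const std::string& text)
{
    Stats s;
    std::size_t src = 0;
    while (src < text.size())
    {
        std::size_t tmp = src;
        while (tmp < text.size() && !needsEscape(text[tmp]))
        {
            ++tmp;
            ++s.scanned;
        }
        if (tmp > src)
        {
            const std::size_t n = std::min(tmp - src, kChunk);
            s.written += n;
            src += n;
        }
        else
        {
            ++src;  // step over one escape character (escaping omitted)
        }
    }
    return s;
}

// Behaviour after the patch: flush the whole clean run, one chunk
// at a time, before scanning for escapes again.
Stats flushWholeRunPerScan(const std::string& text)
{
    Stats s;
    std::size_t src = 0;
    while (src < text.size())
    {
        std::size_t tmp = src;
        while (tmp < text.size() && !needsEscape(text[tmp]))
        {
            ++tmp;
            ++s.scanned;
        }
        if (tmp > src)
        {
            while (tmp > src)  // the loop the patch adds
            {
                const std::size_t n = std::min(tmp - src, kChunk);
                s.written += n;
                src += n;
            }
        }
        else
        {
            ++src;  // step over one escape character (escaping omitted)
        }
    }
    return s;
}
```

      Under these assumptions, a 6MB run with no escape characters is 384
      chunks. The first variant examines 16384 * (384 * 385 / 2) characters,
      roughly 1.2 billion, while the second examines each of the roughly 6.3
      million characters once. Both write the same output.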

          People

            Assignee: Unassigned
            Reporter: Nathan Sharp (spamnps+apachebugzilla@phoenix-int.com)