Xerces-C++ / XERCESC-1977

XMLUTF8Transcoder split surrogates causes Xalan to run out of memory


    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.7.0
    • Component/s: Utilities
    • Labels:
    • Environment:
      Debian Linux 2.4.27


      When a surrogate pair is split across an input buffer boundary, so that the high surrogate is the last code unit in the buffer passed to transcode(), XMLUTF8Transcoder reports success but does not consume the entire buffer. Xalan then loops indefinitely, doubling its buffer on each attempt, until memory allocation fails. The full backtrace is below.
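The failure mode described above can be sketched in isolation. This is a minimal illustration, not the Xerces or Xalan source: `appendUtf8`, `encodeUtf8Stateless`, and `retryWithGrowingBuffer` are hypothetical names, the input is assumed to be well-formed UTF-16 apart from the split pair, and the retry loop is capped so that it terminates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to out as UTF-8.
void appendUtf8(std::uint32_t cp, std::string& out)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical stateless UTF-16 -> UTF-8 encoder (a stand-in, not the Xerces API).
// Returns the number of UTF-16 code units consumed from src.
std::size_t encodeUtf8Stateless(const char16_t* src, std::size_t srcLen, std::string& out)
{
    std::size_t i = 0;
    while (i < srcLen) {
        std::uint32_t cp = src[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            if (i + 1 == srcLen) {
                break; // dangling high surrogate: stop without consuming it
            }
            const std::uint32_t low = src[i + 1];
            assert(low >= 0xDC00 && low <= 0xDFFF); // well-formed input assumed
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            i += 2;
        } else {
            i += 1;
        }
        appendUtf8(cp, out);
    }
    return i;
}

// Hypothetical caller that reads "input not fully consumed" as "output buffer
// too small" and retries with twice the capacity. Returns 0 on success, or the
// capacity at which it gave up. The cap exists only so the sketch terminates.
std::size_t retryWithGrowingBuffer(const char16_t* src, std::size_t srcLen,
                                   std::size_t maxCapacity)
{
    std::size_t capacity = 16;
    while (capacity <= maxCapacity) {
        std::string out;
        out.reserve(capacity);
        if (encodeUtf8Stateless(src, srcLen, out) == srcLen) {
            return 0;
        }
        capacity *= 2; // a larger buffer cannot help: the input is the problem
    }
    return capacity;
}
```

With a buffer that ends in a lone high surrogate, the encoder always stops one unit short, so a caller of this shape never succeeds no matter how large the output buffer grows.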

      I will attach a patch that changes XMLUTF8Transcoder so that it saves the dangling high surrogate in this case and uses it when transcode() is called again.
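The approach described above, saving the dangling high surrogate and using it on the next call, might look roughly like the following. This is a sketch under stated assumptions and not the contents of the attached patch: `Utf8Encoder`, `encode`, `hasPending`, and `appendUtf8` are hypothetical names, and the input is assumed to be well-formed UTF-16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to out as UTF-8.
void appendUtf8(std::uint32_t cp, std::string& out)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical stateful encoder: a high surrogate at the end of one buffer is
// remembered and combined with the first unit of the next buffer.
class Utf8Encoder {
public:
    // Returns the number of UTF-16 code units consumed, which is always srcLen
    // for well-formed input, even when the buffer ends in a high surrogate.
    std::size_t encode(const char16_t* src, std::size_t srcLen, std::string& out)
    {
        std::size_t i = 0;
        while (i < srcLen) {
            std::uint32_t cp = src[i++];
            if (pendingHigh_ != 0) {
                // Complete the pair that was split by the previous call.
                assert(cp >= 0xDC00 && cp <= 0xDFFF); // well-formed input assumed
                cp = 0x10000 + ((pendingHigh_ - 0xD800) << 10) + (cp - 0xDC00);
                pendingHigh_ = 0;
            } else if (cp >= 0xD800 && cp <= 0xDBFF) {
                if (i == srcLen) {
                    pendingHigh_ = cp; // consume it now, emit it on the next call
                    break;
                }
                const std::uint32_t low = src[i++];
                assert(low >= 0xDC00 && low <= 0xDFFF); // well-formed input assumed
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            }
            appendUtf8(cp, out);
        }
        return i;
    }

    // True while a high surrogate from an earlier buffer is waiting for its partner.
    bool hasPending() const { return pendingHigh_ != 0; }

private:
    std::uint32_t pendingHigh_ = 0;
};
```

Because the encoder now carries state between calls, one instance must serve one output stream, and a caller would need to decide what to do if the stream ends while `hasPending()` is still true.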

      (gdb) Primary backtrace
      (gdb) =================
      (gdb) #0 0xf6da37c7 in raise () from /lib/tls/libc.so.6
      #1 0xf6da4f49 in abort () from /lib/tls/libc.so.6
      #2 0xf6f99994 in __gnu_cxx::__verbose_terminate_handler() () from
      #3 0xf6f973b5 in ?? () from /usr/lib/libstdc++.so.6
      #4 0xf6f973f2 in std::terminate() () from /usr/lib/libstdc++.so.6
      #5 0xf6f9752a in __cxa_throw () from /usr/lib/libstdc++.so.6
      #6 0xf76eee1d in xercesc_2_7::MemoryManagerImpl::allocate(unsigned int) ()
      from /usr/lib/libxerces-c.so.27
      #7 0xf724cd1a in xalanc_1_10::XalanOutputStream::transcode(unsigned short
      const*, unsigned int, xalanc_1_10::XalanVector<char,
      xalanc_1_10::MemoryManagedConstructionTraits<char> >&) () from
      #8 0xf724d1af in xalanc_1_10::XalanOutputStream::doWrite(unsigned short
      const*, unsigned int) () from /usr/lib/libxalan-c.so.110
      #9 0xf724d213 in xalanc_1_10::XalanOutputStream::flushBuffer() () from
      #10 0xf724eec8 in xalanc_1_10::XalanOutputStreamPrintWriter::write(unsigned
      short) () from /usr/lib/libxalan-c.so.110
      #11 0xf7268e6a in xalanc_1_10::FormatterToText::characters(unsigned short
      const*, unsigned int) () from /usr/lib/libxalan-c.so.110
      #12 0xf7418674 in xalanc_1_10::XSLTEngineImpl::characters(unsigned short
      const*, unsigned int, unsigned int) () from /usr/lib/libxalan-c.so.110
      #13 0xf73d0ce8 in
      xalanc_1_10::StylesheetExecutionContextDefault::characters(unsigned short
      const*, unsigned int, unsigned int) () from /usr/lib/libxalan-c.so.110
      #14 0xf737c5bb in xalanc_1_10::FormatterListenerAdapater::characters(unsigned
      short const*, unsigned int) () from /usr/lib/libxalan-c.so.110
      #15 0xf725484d in
      xalanc_1_10::DOMServices::getNodeData(xalanc_1_10::XalanElement const&,
      xalanc_1_10::FormatterListener&, void
      (xalanc_1_10::FormatterListener::*)(unsigned short const*, unsigned int)) ()
      from /usr/lib/libxalan-c.so.110
      #16 0xf7254b63 in xalanc_1_10::DOMServices::getNodeData(xalanc_1_10::XalanNode
      const&, xalanc_1_10::FormatterListener&, void
      (xalanc_1_10::FormatterListener::*)(unsigned short const*, unsigned int)) ()
      from /usr/lib/libxalan-c.so.110
      #17 0xf72a3947 in
      xalanc_1_10::XNodeSetBase::str(xalanc_1_10::FormatterListener&, void
      (xalanc_1_10::FormatterListener::*)(unsigned short const*, unsigned int)) const
      () from /usr/lib/libxalan-c.so.110
      #18 0xf72be80e in xalanc_1_10::XPath::executeMore(xalanc_1_10::XalanNode*, int
      const*, xalanc_1_10::XPathExecutionContext&, xalanc_1_10::FormatterListener&,
      void (xalanc_1_10::FormatterListener::*)(unsigned short const*, unsigned int))
      const () from /usr/lib/libxalan-c.so.110
      #19 0xf737c377 in
      const () from /usr/lib/libxalan-c.so.110
      #20 0xf73786ac in
      const () from /usr/lib/libxalan-c.so.110
      #21 0xf73f9f92 in xalanc_1_10::StylesheetRoot::process(xalanc_1_10::XalanNode*,
      xalanc_1_10::XSLTResultTarget&, xalanc_1_10::StylesheetExecutionContext&) const
      () from /usr/lib/libxalan-c.so.110
      #22 0xf740f234 in
      xalanc_1_10::XSLTEngineImpl::process(xalanc_1_10::XSLTInputSource const&,
      xalanc_1_10::XSLTResultTarget&, xalanc_1_10::StylesheetExecutionContext&) ()
      from /usr/lib/libxalan-c.so.110
      #23 0xf744401f in
      const&, xalanc_1_10::XalanCompiledStylesheet const*,
      xalanc_1_10::XSLTInputSource const*, xalanc_1_10::XSLTResultTarget const&) ()
      from /usr/lib/libxalan-c.so.110
      #24 0xf744674b in
      xalanc_1_10::XalanTransformer::transform(xalanc_1_10::XSLTInputSource const&,
      xalanc_1_10::XalanCompiledStylesheet const*, xalanc_1_10::XSLTResultTarget
      const&) () from /usr/lib/libxalan-c.so.110

      1. xercesc-1977.patch
        3 kB
        David K. Taylor




          • Assignee:
            David K. Taylor
          • Votes:
            1
          • Watchers:
            1


            • Created: