  1. Xerces-C++
  2. XERCESC-2094

Memory leak related to invalid encoding

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0, 3.1.1, 3.1.2, 3.1.3, 3.1.4
    • Fix Version/s: 3.2.0
    • Component/s: None
    • Labels: None
    • Environment: Probably all. In this case, Ubuntu 16.04 x86_64

    Description

      Issue originally found through OSS-Fuzz on GDAL (for reference: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=1685; the link will not be publicly accessible until 90 days have passed), but it can be reproduced with the Xerces-C SAX2Count utility.

      On the attached file, Valgrind reports a memory leak (full output further down).

      The content of the file is:
      {{{
      <?xml[newline character]
      version="1.0" encoding="U"?><foo xmlns="http://schemas.opengis.net/gml"/>
      }}}

      valgrind --leak-check=full /home/even/install-xerces-c-3.1.4/bin/SAX2Count xerces-c-leak.xml
      ==21268== Memcheck, a memory error detector
      ==21268== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
      ==21268== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
      ==21268== Command: /home/even/install-xerces-c-3.1.4/bin/SAX2Count /home/even/gdal/trunk/gdal/xerces-c-leak.xml
      ==21268==

      Fatal Error at file /home/even/gdal/trunk/gdal/xerces-c-leak.xml, line 1, char 35
      Message: unable to create converter for 'U' encoding
      ==21268==
      ==21268== HEAP SUMMARY:
      ==21268== in use at exit: 76,348 bytes in 10 blocks
      ==21268== total heap usage: 9,244 allocs, 9,234 frees, 1,282,907 bytes allocated
      ==21268==
      ==21268== 52 (40 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 10
      ==21268== at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
      ==21268== by 0x4FF9B58: xercesc_3_1::MemoryManagerImpl::allocate(unsigned long) (MemoryManagerImpl.cpp:40)
      ==21268== by 0x4F7EE05: xercesc_3_1::XMemory::operator new(unsigned long, xercesc_3_1::MemoryManager*) (XMemory.cpp:68)
      ==21268== by 0x4F7E660: xercesc_3_1::ENameMapFor<xercesc_3_1::XMLUTF8Transcoder>::makeNew(unsigned long, xercesc_3_1::MemoryManager*) const (TransENameMap.c:50)
      ==21268== by 0x4F7AF20: xercesc_3_1::XMLTransService::makeNewTranscoderFor(unsigned short const*, xercesc_3_1::XMLTransService::Codes&, unsigned long, xercesc_3_1::MemoryManager*) (TransService.cpp:147)
      ==21268== by 0x5010A75: xercesc_3_1::XMLReader::refreshCharBuffer() (XMLReader.cpp:523)
      ==21268== by 0x4FFA5AA: peekNextChar (XMLReader.hpp:767)
      ==21268== by 0x4FFA5AA: xercesc_3_1::ReaderMgr::peekNextChar() (ReaderMgr.cpp:158)
      ==21268== by 0x5016297: xercesc_3_1::XMLScanner::scanProlog() (XMLScanner.cpp:1238)
      ==21268== by 0x4FEE371: xercesc_3_1::IGXMLScanner::scanDocument(xercesc_3_1::InputSource const&) (IGXMLScanner.cpp:206)
      ==21268== by 0x5017E6D: xercesc_3_1::XMLScanner::scanDocument(unsigned short const*) (XMLScanner.cpp:400)
      ==21268== by 0x5018221: xercesc_3_1::XMLScanner::scanDocument(char const*) (XMLScanner.cpp:408)
      ==21268== by 0x5044F47: xercesc_3_1::SAX2XMLReaderImpl::parse(char const*) (SAX2XMLReaderImpl.cpp:451)
      ==21268==
      ==21268== LEAK SUMMARY:
      ==21268== definitely lost: 40 bytes in 1 blocks
      ==21268== indirectly lost: 12 bytes in 1 blocks
      ==21268== possibly lost: 0 bytes in 0 blocks
      ==21268== still reachable: 76,296 bytes in 8 blocks
      ==21268== suppressed: 0 bytes in 0 blocks
      ==21268== Reachable blocks (those to which a pointer was found) are not shown.
      ==21268== To see them, rerun with: --leak-check=full --show-leak-kinds=all
      ==21268==
      ==21268== For counts of detected and suppressed errors, rerun with: -v
      ==21268== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

      I've found that the leak occurs only if both of the following conditions are met: there is a newline character between <?xml and version="1.0", and the value of the encoding attribute is an invalid encoding name.
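
      For anyone who wants to recreate the input without the attachment, here is a minimal sketch of a helper that builds the document from the two conditions above. The function names makeReproducer and writeReproducer are hypothetical, and the exact bytes of the attached file (for example, whether it ends with a trailing newline) are an assumption based on the content shown earlier in this report.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: builds a document shaped like the one in this report.
// 'separator' is the whitespace placed between "<?xml" and "version";
// 'encoding' is the value of the encoding pseudo-attribute.
// Assumption: no trailing newline; the attachment's exact bytes are not shown.
std::string makeReproducer(const std::string& encoding = "U",
                           const std::string& separator = "\n") {
    return "<?xml" + separator + "version=\"1.0\" encoding=\"" + encoding +
           "\"?><foo xmlns=\"http://schemas.opengis.net/gml\"/>";
}

// Hypothetical helper: writes the document to 'path' in binary mode so the
// newline is stored as a single '\n' byte on every platform.
bool writeReproducer(const std::string& path,
                     const std::string& encoding = "U",
                     const std::string& separator = "\n") {
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out << makeReproducer(encoding, separator);
    return static_cast<bool>(out);
}
```

      Calling writeReproducer("xerces-c-leak.xml") should produce a file that can be passed to SAX2Count under Valgrind as in the command above. Passing " " as the separator, or a valid name such as "UTF-8" as the encoding, gives the variants that, per the observation above, do not leak.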

      Attachments

        1. xerces-c-leak.xml
          0.1 kB
          Even Rouault

          People

            Assignee: Scott Cantor (scantor)
            Reporter: Even Rouault (rouault)
            Votes: 0
            Watchers: 2
