Xerces-C++ / XERCESC-1361

CRLF is translated to LF in scanCharData

Details

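    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: SAX/SAX2
    • Labels: None
    • Environment: Windows 2000; Xerces-C 2.6 (built from source with VC6 SP5) and the Xerces-C 2.1 binary release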

    Description

      When I parse a simple XML document that has a CRLF between aaa and bbb, the SAX parser calls the characters method with the string translated to aaa LF bbb. The CR character is lost.

      <?xml version="1.0" encoding="gb2312" standalone="no"?>
      <dd><ddrow><text>aaa
      bbb</text>
      </ddrow></dd>

      Tracing the code, I found that the character is consumed by handleEOL. I want to keep the content unchanged. Is that reasonable? Thanks.

      The call stack
      xercesc_2_6::XMLReader::handleEOL(unsigned short & 0x000d, unsigned char 0x00) line 898
      xercesc_2_6::XMLReader::getNextCharIfNot(const unsigned short 0x003c, unsigned short & 0x000d) line 789
      xercesc_2_6::ReaderMgr::getNextCharIfNot(const unsigned short 0x003c, unsigned short & 0x000d) line 398
      xercesc_2_6::IGXMLScanner::scanCharData(xercesc_2_6::XMLBuffer & {...}) line 2630 + 17 bytes
      xercesc_2_6::IGXMLScanner::scanContent() line 837
      xercesc_2_6::IGXMLScanner::scanDocument(const xercesc_2_6::InputSource & {...}) line 204 + 8 bytes
      xercesc_2_6::SAXParser::parse(const xercesc_2_6::InputSource & {...}) line 720

      internal\XMLReader.hpp, line 895:
      if ( fCharBuf[fCharIndex] == chLF ||
           ((fCharBuf[fCharIndex] == chNEL) && fNEL) )
      {
          fCharIndex++;
      }
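
      The behaviour described above matches the end-of-line handling that XML 1.0 (section 2.11) requires of every parser: before parsing, each literal CRLF pair and each lone CR is replaced by a single LF. The following is a minimal sketch of that rule only, not the Xerces-C++ implementation. The function name `normalizeLineEnds` is hypothetical, and the sketch ignores the optional NEL handling visible in the snippet above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a Xerces-C++ API. It applies the XML 1.0
// section 2.11 end-of-line rule to an already-decoded UTF-16 string.
std::u16string normalizeLineEnds(const std::u16string& in)
{
    const char16_t CR = 0x000D;
    const char16_t LF = 0x000A;

    std::u16string out;
    out.reserve(in.size());

    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == CR) {
            // Every CR is reported as a single LF.
            out.push_back(LF);
            // An LF directly after the CR is skipped, so CRLF becomes LF.
            if (i + 1 < in.size() && in[i + 1] == LF)
                ++i;
        } else {
            out.push_back(in[i]);
        }
    }
    return out;
}
```

      Calling `normalizeLineEnds(u"aaa\r\nbbb")` yields `aaa`, LF, `bbb`, which is the string the reporter sees in characters. The rule applies only to literal line-end characters in the input: a carriage return written as the character reference `&#13;` is not normalized and reaches the application as CR.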

          People

            Assignee: Unassigned
            Reporter: ding hua (dingkevin)