  1. Xerces-C++
  2. XERCESC-1955

Xerces throws an exception while parsing a Unicode file, but the same works fine for an ANSI file


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Cannot Reproduce
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: DOM
    • Labels: None
    • Environment: Windows XP 32-bit, Windows 7 64-bit

    Description

      Hi All,

      Please let me know if anybody can provide a clue on this.

      I have been using Xerces as the XML parser in my C++ application, and I recently migrated from Xerces version 1.3 (very old) to 3.1.

      After that, when I call AbstractDOMParser::parse(const xercesc_3_1::InputSource & source={...}) and pass a Unicode file as input, it throws an exception. However, the same call works fine for an ANSI file.

      The call stack is as shown below.

      xerces-c_3_1.dll!xercesc_3_1::XMLScanner::scanProlog() Line 1227 + 0x25 bytes
      xerces-c_3_1.dll!xercesc_3_1::IGXMLScanner::scanDocument(const xercesc_3_1::InputSource & src={...}) Line 210
      xerces-c_3_1.dll!xercesc_3_1::AbstractDOMParser::parse(const xercesc_3_1::InputSource & source={...}) Line 549
      EPConfigTool.dll!XCfgXMLParser::parse() Line 66 (my application code)

      In the code, execution reaches this branch:

          else
          {
              emitError(XMLErrs::InvalidDocumentStructure);
              ...
          }

      The function in which the parse fails is shown below:

      void XMLScanner::scanProlog()
      {
          bool sawDocTypeDecl = false;
          // Get a buffer for whitespace processing
          XMLBufBid bbCData(&fBufMgr);

          // Loop through the prolog. If there is no content, this could go all
          // the way to the end of the file.
          try
          {
              while (true)
              {
                  const XMLCh nextCh = fReaderMgr.peekNextChar();

                  if (nextCh == chOpenAngle)
                  {
                      // Ok, it could be the xml decl, a comment, the doc type line,
                      // or the start of the root element.
                      if (checkXMLDecl(true))
                      {
                          // There shall be at least -ONE- space in between
                          // the tag '<?xml' and the VersionInfo.
                          //
                          // If we are not at line 1, col 6, then the decl was not
                          // the first text, so it's invalid.
                          const XMLReader* curReader = fReaderMgr.getCurrentReader();
                          if ((curReader->getLineNumber() != 1)
                          ||  (curReader->getColumnNumber() != 7))
                          {
                              emitError(XMLErrs::XMLDeclMustBeFirst);
                          }

                          scanXMLDecl(Decl_XML);
                      }
                      else if (fReaderMgr.skippedString(XMLUni::fgPIString))
                      {
                          scanPI();
                      }
                      else if (fReaderMgr.skippedString(XMLUni::fgCommentString))
                      {
                          scanComment();
                      }
                      else if (fReaderMgr.skippedString(XMLUni::fgDocTypeString))
                      {
                          if (sawDocTypeDecl)
                          {
                              emitError(XMLErrs::DuplicateDocTypeDecl);
                          }

                          scanDocTypeDecl();
                          sawDocTypeDecl = true;

                          // if reusing grammar, this has been validated already in first scan
                          // skip for performance
                          if (fValidate && fGrammar && !fGrammar->getValidated())
                          {
                              // validate the DTD scan so far
                              fValidator->preContentValidation(fUseCachedGrammar, true);
                          }
                      }
                      else
                      {
                          // Assume it's the start of the root element
                          return;
                      }
                  }
                  else if (fReaderMgr.getCurrentReader()->isWhitespace(nextCh))
                  {
                      // If we have a document handler then gather up the
                      // whitespace and call back. Otherwise just skip over spaces.
                      if (fDocHandler)
                      {
                          fReaderMgr.getSpaces(bbCData.getBuffer());
                          fDocHandler->ignorableWhitespace
                          (
                              bbCData.getRawBuffer()
                              , bbCData.getLen()
                              , false
                          );
                      }
                      else
                      {
                          fReaderMgr.skipPastSpaces();
                      }
                  }
                  else
                  {
                      emitError(XMLErrs::InvalidDocumentStructure);

                      // Watch for end of file and break out
                      if (!nextCh)
                          break;
                      else
                          fReaderMgr.skipPastChar(chCloseAngle);
                  }
              }
          }
          catch(const EndOfEntityException&)
          {
              // We should never get an end of entity here. They should only
              // occur within the doc type scanning method, and not leak out to
              // here.
              emitError
              (
                  XMLErrs::UnexpectedEOE
                  , "in prolog"
              );
          }
      }
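
      Since the failing branch is the one taken when the first character the scanner sees is neither '<' nor whitespace, one thing that might be worth checking is what the file actually starts with at the byte level. The following is only a minimal sketch under assumptions: it assumes the "Unicode" file was saved with a byte-order mark (which is not confirmed here), and detectBom is a hypothetical helper, not part of the Xerces API. It classifies the leading bytes of a buffer and does not by itself explain the exception.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (NOT a Xerces API): report which byte-order mark, if any,
// a buffer starts with. The UTF-32 checks come before the UTF-16 ones because
// the UTF-32LE mark (FF FE 00 00) begins with the UTF-16LE mark (FF FE).
std::string detectBom(const unsigned char* data, std::size_t len)
{
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (len >= 4 && data[0] == 0xFF && data[1] == 0xFE
                 && data[2] == 0x00 && data[3] == 0x00)
        return "UTF-32LE";
    if (len >= 4 && data[0] == 0x00 && data[1] == 0x00
                 && data[2] == 0xFE && data[3] == 0xFF)
        return "UTF-32BE";
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16LE";
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16BE";
    // No recognizable mark, e.g. a plain ANSI/ASCII file starting with "<?xml".
    return "none";
}
```

      Feeding the first few bytes of MyXML.xml to such a helper, and comparing the result with the encoding named in the file's XML declaration, could help narrow down whether the difference between the Unicode and ANSI files is related to encoding detection.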

      It is working fine when I move back to version 1.3, but due to various other requirements, I have to use the new version 3.1 in my application.

      Thanks in advance,
      Jojo

      Attachments

        1. MyXML.xml
          43 kB
          Jojo Jose


          People

            Assignee: Unassigned
            Reporter: Jojo Jose (jojo.jose)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Original Estimate: 20h
                Remaining Estimate: 20h
                Time Spent: Not Specified