  1. Xerces-C++
  2. XERCESC-1727

On Linux "Input data transcoding error" message does not contain invalid character if no proper locale is set

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 3.0.1
    • Component/s: Utilities
    • Labels: None
    • Environment: Linux, both 32- and 64-bit Red Hat, probably others

    Description

      The test case involved a Russian letter placed in an XML file. Transcoding failed with the "Input data transcoding error..." message, but the offending character was not displayed. I've traced the problem to the memory allocation for the single character being transcoded. In the Linux implementation, the size of a character is determined via 'mblen' in calcRequiredSize(..), but it seems that nothing beyond plain 7-bit ASCII is accepted unless a suitable locale has been set.

      Here is a code snippet based on the original calcRequiredSize, fed with the character I used:

      // needs <clocale> (setlocale) and <cstdlib> (std::mblen, MB_CUR_MAX)
      // byte value 192 (0xC0); note that '\192' is not a valid single-character escape in C/C++
      char sExp[2] = {'\xC0', '\0'};
      // the line below "fixes" the case: the character (Russian 'A') is shown within the exception message
      // setlocale(LC_ALL,"Russian");
      int iLen = std::mblen(&sExp[0], MB_CUR_MAX);
      if (-1 == iLen)
      {
          // ERROR! we should not be here, since no allocation will be done -- and that's what I get
          throw 1;
      }

      Other platforms (Win32, Solaris, etc.) worked fine.
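
      One possible way to keep the offending byte available for the error message is sketched below. It is only an illustration under stated assumptions: the helper name charByteLength is hypothetical, it is not part of Xerces-C++, and it is not necessarily how the issue was resolved in 3.0.1. The idea is to fall back to a length of one byte whenever std::mblen rejects the input, so that a buffer is still allocated.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not a Xerces-C++ function): number of bytes to
// reserve for the character that starts at `s`.
std::size_t charByteLength(const char* s)
{
    // Reset any shift state left over from earlier conversions.
    std::mblen(nullptr, 0);

    const int len = std::mblen(s, MB_CUR_MAX);
    if (len > 0)
        return static_cast<std::size_t>(len);  // valid character in the current locale
    if (len == 0)
        return 0;                              // `s` points at the terminating '\0'

    // std::mblen returned -1: the byte sequence is invalid in the current
    // locale. Assumption of this sketch: treat it as one raw byte so the
    // caller can still copy it into the diagnostic text.
    return 1;
}
```

      With the sExp array from the report, charByteLength(sExp) yields 1 whether or not a locale has been set, so the single byte could still be copied into the exception text.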

          People

            Assignee: Unassigned
            Reporter: Sergey Melnikov (muller)
            Votes: 5
            Watchers: 0
