  1. XalanC
  2. XALANC-743

XalanOutputStream::transcode falls into an infinite loop on 4-byte Unicode characters until out of memory

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.10
    • Fix Version/s: None
    • Component/s: XalanC
    • Labels: None
    • Environment: Linux

    Description

      In some rare cases, XalanTransformer::transform gets stuck or crashes when the input or stylesheet contains 4-byte Unicode characters. I traced the root cause to XalanOutputStream::transcode.

      When the transcode buffer contains 4-byte Unicode characters (UTF-16 surrogate pairs) and the last XalanDOMChar in the buffer is the first 2 bytes (the high surrogate) of such a character, XalanOutputStream::transcode falls into an infinite loop until it runs out of memory. The cause is that XMLUTF8Transcoder.cpp in Xerces does not consume the last 2 bytes if they are the first half of a 4-byte character, while transcode keeps looping until every character in the buffer has been consumed. Specifically, this happens when the last XalanDOMChar in the input buffer is between 0xD800 and 0xDBFF.
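
The triggering condition can be sketched as a small check. This is only an illustration, not code from Xalan or Xerces: `CodeUnit` is an assumed 16-bit stand-in for XalanDOMChar, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed stand-in for XalanDOMChar: one 16-bit UTF-16 code unit.
using CodeUnit = std::uint16_t;

// True if the code unit is a high (leading) surrogate, 0xD800..0xDBFF.
inline bool isHighSurrogate(CodeUnit c)
{
    return c >= 0xD800 && c <= 0xDBFF;
}

// True if the buffer ends with the first half of a surrogate pair, which is
// the case in which the last code unit is reported as left unconsumed.
inline bool endsWithSplitPair(const CodeUnit* buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    return len > 0 && isHighSurrogate(buf[len - 1]);
}
```

For example, the buffer { 0x0041, 0xD83D, 0xDE00 } is fine as a whole, but if it is cut after the second code unit, the remaining buffer ends with 0xD83D and matches the condition.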

      I could not find whether this issue has been reported before. This is version 1.10. I have a fix that adds a bool reference parameter to the function, so that the caller can push the last 2 bytes back into the buffer if they were not consumed, but I wanted to check here before submitting any fix.
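
A minimal sketch of the general idea, not the actual patch: the drain loop stops when the transcoder makes no progress and tells the caller how much input is left over. Everything here is an assumption made for illustration. `encodeSome` is a toy UTF-16 to UTF-8 step that mimics the behaviour described above (it assumes well-formed pairs), `transcodeAll` is a hypothetical replacement for the loop, and it reports a count of pending code units rather than the bool mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

using CodeUnit = std::uint16_t;  // assumed stand-in for XalanDOMChar

// Hypothetical transcoder step: appends UTF-8 for as much of src as it can
// and returns the number of code units consumed. A trailing high surrogate
// is left unconsumed. Assumes surrogate pairs are otherwise well formed.
std::size_t encodeSome(const CodeUnit* src, std::size_t len, std::string& out)
{
    std::size_t i = 0;
    while (i < len) {
        std::uint32_t cp = src[i];
        std::size_t used = 1;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            if (i + 1 == len) break;  // pair is split across buffers
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            used = 2;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        i += used;
    }
    return i;
}

// Transcodes the buffer and sets pending to the number of trailing code
// units that were not consumed, so the caller can push them back.
std::string transcodeAll(const CodeUnit* src, std::size_t len, std::size_t& pending)
{
    std::string out;
    std::size_t done = 0;
    while (done < len) {
        const std::size_t n = encodeSome(src + done, len - done, out);
        if (n == 0) break;  // no progress: stop instead of spinning
        done += n;
    }
    pending = len - done;
    return out;
}
```

The caller would keep the pending code units at the front of its buffer and retry once the second half of the pair has arrived.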

            People

              Assignee: shathaway Steven J. Hathaway
              Reporter: jfan Jiangbei Fan
              Votes: 0
              Watchers: 3
