HARMONY-6640

UTF8 decoder doesn't properly decode supplementary characters

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0M14
    • Fix Version/s: 5.0M16
    • Component/s: Classlib
    • Labels: None
    • Environment: Windows Vista
    • Patch Info: Patch Available

    Description

      When attempting to build Lucene, I discovered a problem with UTF-8 decoding.
      (This actually prevents our tests from even compiling without a workaround.)

      For any code point > 0xFFFF (a 4-byte UTF-8 sequence), the decoder doesn't
      properly split the decoded code point into a surrogate pair.
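
      To illustrate the expected behaviour, here is a minimal sketch (not the
      code from the attached patch) of how a 4-byte UTF-8 sequence maps to a
      code point and then to a high/low surrogate pair. The class and method
      names are hypothetical, the decoding step does no validation, and
      StandardCharsets assumes a Java 7+ runtime rather than Harmony itself.

```java
import java.nio.charset.StandardCharsets;

public class SupplementaryDecodeSketch {
    // Hypothetical helper: decode one 4-byte UTF-8 sequence into a code point.
    // No validation of lead/continuation bytes; illustration only.
    static int decodeFourByte(byte[] b) {
        return ((b[0] & 0x07) << 18)
             | ((b[1] & 0x3F) << 12)
             | ((b[2] & 0x3F) << 6)
             |  (b[3] & 0x3F);
    }

    // Hypothetical helper: split a supplementary code point (> 0xFFFF)
    // into a UTF-16 high/low surrogate pair.
    static char[] toSurrogates(int codePoint) {
        int offset = codePoint - 0x10000;
        char high = (char) (0xD800 + (offset >> 10));
        char low = (char) (0xDC00 + (offset & 0x3FF));
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        // UTF-8 encoding of U+10000, the first supplementary code point.
        byte[] utf8 = { (byte) 0xF0, (byte) 0x90, (byte) 0x80, (byte) 0x80 };
        int cp = decodeFourByte(utf8);
        char[] pair = toSurrogates(cp);

        // A correct decoder should yield exactly these two chars.
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        boolean matches = decoded.length() == 2
                && decoded.charAt(0) == pair[0]
                && decoded.charAt(1) == pair[1];

        System.out.printf("U+%X -> %04X %04X%n", cp, (int) pair[0], (int) pair[1]);
        System.out.println("decoder matches: " + matches);
    }
}
```

      On a runtime with a correct decoder, the decoded string has length 2 and
      holds the same pair that Character.toChars returns for the code point.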

      Attachments

        1. HARMONY-6640.patch
          1 kB
          Robert Muir
        2. HARMONY-6640.patch
          2 kB
          Robert Muir
        3. nio_char.jar
          1.34 MB
          Mark Hindess

          People

            Assignee: Tim Ellison (tellison)
            Reporter: Robert Muir (rcmuir)