  1. Lucene - Core
  2. LUCENE-2019

Map Unicode process-internal code points to the replacement character

Details

    • Improvement
    • Status: Reopened
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • core/index
    • None
    • New

    Description

      A spinoff from LUCENE-2016.

      Unicode contains several process-internal code points, and we should not store these in the index.
      Instead, they should be mapped to the replacement character (U+FFFD), so they remain available for process-internal use.

      For example, Lucene Java currently uses U+FFFF process-internally, so it cannot appear in the index without causing problems.
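
The mapping described above could be sketched as follows. This is an illustrative sketch only, not the attached patch: the class and method names (`NoncharacterMapper`, `isNoncharacter`, `mapNoncharacters`) are hypothetical, and it assumes that "process-internal code points" means the Unicode noncharacters (U+FDD0..U+FDEF plus the last two code points of every plane, which includes U+FFFF).

```java
// Hypothetical helper, not part of Lucene: replaces Unicode noncharacters
// with U+FFFD before text would reach the index.
final class NoncharacterMapper {
    private static final int REPLACEMENT = 0xFFFD;

    // Assumption: "process-internal" == Unicode noncharacters, i.e.
    // U+FDD0..U+FDEF and any code point whose low 16 bits are FFFE or FFFF.
    static boolean isNoncharacter(int cp) {
        return (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE;
    }

    // Walks the string by code point so supplementary-plane noncharacters
    // (e.g. U+1FFFE) are handled as well as BMP ones such as U+FFFF.
    // Unpaired surrogates are passed through unchanged.
    static String mapNoncharacters(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            out.appendCodePoint(isNoncharacter(cp) ? REPLACEMENT : cp);
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

With this sketch, a term containing U+FFFF would be indexed with U+FFFD in its place, leaving U+FFFF free for internal use; ordinary text passes through unchanged.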

      Attachments

        1. LUCENE-2019.patch
          1 kB
          Robert Muir

          People

            Assignee: Unassigned
            Reporter: Robert Muir (rcmuir)