Details
Type: Improvement
Status: Reopened
Priority: Minor
Resolution: Unresolved
Lucene Fields: New
Description
A spinoff from LUCENE-2016.
Unicode contains several code points intended for process-internal use; we should not store these in the index.
Instead, they should be mapped to the replacement character (U+FFFD), so that they remain available for process-internal use.
For example, Lucene Java currently uses U+FFFF process-internally, so that code point cannot appear in the index without causing problems.
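
A minimal sketch of the proposed mapping follows. It assumes that "process-internal codepoints" means the Unicode noncharacters (U+FDD0..U+FDEF, plus the last two code points of every plane, such as U+FFFE and U+FFFF). The class and method names are hypothetical and are not part of Lucene's API.

```java
// Hypothetical helper, not a Lucene class: replaces Unicode noncharacters
// with U+FFFD before text would be written to an index.
final class NoncharacterMapper {
    private static final int REPLACEMENT = 0xFFFD;

    // Assumption: "process-internal" code points are the Unicode noncharacters.
    static boolean isNoncharacter(int cp) {
        return (cp >= 0xFDD0 && cp <= 0xFDEF)   // contiguous block of 32 in the BMP
            || (cp & 0xFFFE) == 0xFFFE;         // U+nFFFE and U+nFFFF in every plane
    }

    static String mapNoncharacters(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // Iterate by code point so supplementary-plane noncharacters are caught.
            int cp = text.codePointAt(i);
            out.appendCodePoint(isNoncharacter(cp) ? REPLACEMENT : cp);
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

With such a mapping applied to incoming text, a value like U+FFFF could never reach the index, which would leave it free for internal use as described above.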