  1. Lucene - Core
  2. LUCENE-4799

Enable extraction of originating term for ICU collation keys

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • 4.1
    • None
    • core/other
    • New, Patch Available

    Description

      By concatenating the bytes of a generated ICU collation key with the originating term, the originating term can be extracted at a later time. This makes it possible to build a collator-sorted facet field and similar multi-value-per-document structures.

      ICU collation keys are guaranteed to be terminated by a 0 byte (https://ssl.icu-project.org/apiref/icu4j48rc1/com/ibm/icu/text/CollationKey.html), and since comparison of keys stops when a 0 is encountered, appending the originating term does not affect sort order. As 0 is used only for termination in the key bytes, the extraction of the originating term is unambiguous.
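
      The idea can be sketched in plain Java as below. This is a minimal illustration, not the code from the attached patch: the class KeyWithTerm and its methods are hypothetical names, and the key bytes are assumed to come from something like ICU4J's CollationKey.toByteArray(), ending in a single 0 and containing no other 0 byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper; not part of Lucene or ICU4J. */
public final class KeyWithTerm {
    private KeyWithTerm() {}

    /** Returns keyBytes followed by the UTF-8 bytes of the originating term. */
    public static byte[] append(byte[] keyBytes, String term) {
        // Assumption: the collation key is 0-terminated and has no other 0 byte.
        if (keyBytes.length == 0 || keyBytes[keyBytes.length - 1] != 0) {
            throw new IllegalArgumentException("key bytes must end with a 0 terminator");
        }
        byte[] termBytes = term.getBytes(StandardCharsets.UTF_8);
        byte[] combined = Arrays.copyOf(keyBytes, keyBytes.length + termBytes.length);
        System.arraycopy(termBytes, 0, combined, keyBytes.length, termBytes.length);
        return combined;
    }

    /** Recovers the originating term: everything after the first 0 byte. */
    public static String extractTerm(byte[] combined) {
        for (int i = 0; i < combined.length; i++) {
            if (combined[i] == 0) {
                return new String(combined, i + 1, combined.length - i - 1,
                        StandardCharsets.UTF_8);
            }
        }
        throw new IllegalArgumentException("no 0 terminator found");
    }

    /** Unsigned byte-wise comparison that stops at the first shared 0 terminator. */
    public static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return Integer.compare(x, y);
            }
            if (x == 0) {
                return 0; // both keys ended here; the appended terms are ignored
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

      Note that compareKeys models a comparison that stops at the terminator. A comparator that looks at every byte would instead use the appended term as a tie-breaker between equal keys.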

      Attachments

        1. LUCENE-4799.patch
          14 kB
          Toke Eskildsen

          People

            Assignee: Unassigned
            Reporter: Toke Eskildsen (toke)
            Votes: 1
            Watchers: 2

            Dates

              Created:
              Updated: