  1. Lucene - Core
  2. LUCENE-4799

Enable extraction of originating term for ICU collation keys

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.1
    • Fix Version/s: None
    • Component/s: core/other
    • Labels:
    • Lucene Fields:
      New, Patch Available

      Description

      By appending the originating term to the bytes of a generated ICU collation key, it becomes possible to extract the originating term at a later time. This makes it possible to build a collator-sorted facet field and similar structures with multiple values per document.

      ICU collation keys are guaranteed to be terminated by a 0 byte (https://ssl.icu-project.org/apiref/icu4j48rc1/com/ibm/icu/text/CollationKey.html), and since comparison of keys stops when a 0 is encountered, appending the originating term does not affect sort order. As 0 bytes are used only for termination in the key bytes, the extraction of the originating term is unambiguous.
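
      The idea above can be sketched as follows. This is a minimal illustration, not the code from the attached patch: the class name KeyWithTerm and its two methods are hypothetical, and the term is assumed to be stored as UTF-8. In real use the key bytes would come from ICU4J, for example collator.getCollationKey(term).toByteArray(); here the key is passed in as a plain byte array so the sketch has no dependency beyond the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: not part of Lucene or ICU4J. */
public class KeyWithTerm {

    /** Appends the UTF-8 bytes of the term to a 0-terminated collation key. */
    static byte[] append(byte[] key, String term) {
        // Assumption: the key ends with the single terminating 0 byte.
        if (key.length == 0 || key[key.length - 1] != 0) {
            throw new IllegalArgumentException("key must be 0-terminated");
        }
        byte[] termBytes = term.getBytes(StandardCharsets.UTF_8);
        byte[] combined = Arrays.copyOf(key, key.length + termBytes.length);
        System.arraycopy(termBytes, 0, combined, key.length, termBytes.length);
        return combined;
    }

    /** Returns the term stored after the first 0 byte of the combined value. */
    static String extractTerm(byte[] combined) {
        for (int i = 0; i < combined.length; i++) {
            if (combined[i] == 0) {
                // Everything after the key terminator is the originating term.
                return new String(combined, i + 1, combined.length - i - 1,
                        StandardCharsets.UTF_8);
            }
        }
        throw new IllegalArgumentException("no 0 terminator found");
    }
}
```

      Because the extraction scans for the first 0 byte, it relies on the key containing no 0 other than its terminator, as stated in the description.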

        Attachments

        1. LUCENE-4799.patch
          14 kB
          Toke Eskildsen

            People

            • Assignee: Unassigned
            • Reporter: Toke Eskildsen
            • Votes: 1
            • Watchers: 2
