Lucene - Core / LUCENE-4806

change FacetIndexingParams.DEFAULT_FACET_DELIM_CHAR to U+001F

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2, 6.0
    • Component/s: modules/facet
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The current delim char takes 3 bytes as UTF-8 ... but U+001F (= INFORMATION_SEPARATOR, which seems appropriate) takes only 1 byte.
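
      The byte counts in the description can be checked with a minimal sketch like the one below. Note the assumption: this issue never names the previous delimiter, so U+F749 (a private-use code point) is used here only as an illustrative three-byte character. The class and method names are hypothetical and not part of Lucene.

      ```java
      import java.nio.charset.StandardCharsets;

      // Hypothetical helper, not part of Lucene: measures the UTF-8 size of a delimiter char.
      class DelimByteLength {

          // Number of bytes a single (non-surrogate) char occupies when encoded as UTF-8.
          static int utf8Length(char c) {
              return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length;
          }

          public static void main(String[] args) {
              // Assumption: stand-in for the old delimiter; any char in U+0800..U+FFFF
              // outside the surrogate range encodes to three bytes.
              char oldDelim = '\uF749';
              // U+001F, the delimiter proposed in this issue.
              char newDelim = '\u001F';

              System.out.println("old delim bytes: " + utf8Length(oldDelim));
              System.out.println("new delim bytes: " + utf8Length(newDelim));
          }
      }
      ```

      The saving is two bytes per delimiter occurrence, so it scales with the number of path components stored in the index.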

      1. LUCENE-4806.patch
        3 kB
        Michael McCandless

        Activity

        Shai Erera added a comment -

        Cool. Just note this under back-compat, along with the previous char, so that whoever doesn't want to reindex can keep using the old char. Also, note that the taxonomy index uses the same char, but under a different setting (look in class Consts, package-private). Although the two don't need to be in sync, it might be good to change it there too, so that the taxonomy index, and possibly its caches, become smaller.
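
        The back-compat concern can be illustrated with a small sketch. Everything here is an assumption for illustration: `Params` and `LegacyParams` are hypothetical stand-ins rather than Lucene's `FacetIndexingParams`, and U+F749 is only an assumed value for the previous delimiter. The sketch shows that terms built with different delimiters do not match, which is why an existing index either needs reindexing or a configuration that keeps the old char.

        ```java
        import java.nio.charset.StandardCharsets;

        // Hypothetical sketch, not Lucene code: why changing the delimiter breaks back-compat.
        class DelimCompatSketch {
            static final char NEW_DELIM = '\u001F'; // delimiter proposed in this issue
            static final char OLD_DELIM = '\uF749'; // assumed previous delimiter (illustrative)

            // Stand-in for an indexing-params object that exposes the delimiter.
            static class Params {
                char facetDelimChar() { return NEW_DELIM; }
            }

            // Override that keeps the old delimiter for indexes built before the change.
            static class LegacyParams extends Params {
                @Override
                char facetDelimChar() { return OLD_DELIM; }
            }

            // Joins category path components into a single term using the configured delimiter.
            static String joinPath(Params params, String... components) {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < components.length; i++) {
                    if (i > 0) {
                        sb.append(params.facetDelimChar());
                    }
                    sb.append(components[i]);
                }
                return sb.toString();
            }

            public static void main(String[] args) {
                String withNew = joinPath(new Params(), "Author", "Mark Twain");
                String withOld = joinPath(new LegacyParams(), "Author", "Mark Twain");

                // The two terms differ, so a lookup built with one delimiter
                // will not find terms indexed with the other.
                System.out.println("terms match: " + withNew.equals(withOld));
                System.out.println("new term bytes: " + withNew.getBytes(StandardCharsets.UTF_8).length);
                System.out.println("old term bytes: " + withOld.getBytes(StandardCharsets.UTF_8).length);
            }
        }
        ```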

        Michael McCandless added a comment -

        Simple patch, with note about the back compat break. I think it's ready!

        Commit Tag Bot added a comment -

        [trunk commit] Michael McCandless
        http://svn.apache.org/viewvc?view=revision&revision=1451578

        LUCENE-4806: change facet delim character to use 1 byte instead of 3 (in UTF-8)

        Commit Tag Bot added a comment -

        [branch_4x commit] Michael McCandless
        http://svn.apache.org/viewvc?view=revision&revision=1451579

        LUCENE-4806: change facet delim character to use 1 byte instead of 3 (in UTF-8)

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0
            Watchers:
            1
