Lucene - Core / LUCENE-4806

change FacetIndexingParams.DEFAULT_FACET_DELIM_CHAR to U+001F

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2, 6.0
    • Component/s: modules/facet
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The current delim char takes 3 bytes as UTF-8 ... but U+001F (= INFORMATION_SEPARATOR, which seems appropriate) takes only 1 byte.
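The byte counts above follow from how UTF-8 encodes code points: anything below U+0080 takes one byte, while BMP code points at or above U+0800 take three. A minimal sketch that checks this, assuming U+20AC as an arbitrary stand-in for a three-byte character (the issue text does not name the previous delimiter) and a hypothetical helper `utf8Length`:

```java
import java.nio.charset.StandardCharsets;

final class DelimByteLength {

    // Hypothetical helper: number of bytes one char occupies when encoded as UTF-8.
    static int utf8Length(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        char newDelim = '\u001F';         // the delimiter proposed in this issue
        char threeByteStandIn = '\u20AC'; // assumption: illustrative only, not the old default

        System.out.println(utf8Length(newDelim));         // 1
        System.out.println(utf8Length(threeByteStandIn)); // 3
    }
}
```

Since the delimiter appears once per path separator in every category term, the saving is two bytes per separator.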

      1. LUCENE-4806.patch
        3 kB
        Michael McCandless

        Activity

        shaie Shai Erera added a comment -

        Cool. Just note this under back-compat, along with the previous char, so that whoever doesn't want to re-index can keep using the old char. Also, note that the taxonomy index uses the same char, but under a different setting (look in class Consts, package-private). Although the two don't need to be in sync, it might be good to change it there too, so that the taxonomy is even smaller and its caches might be smaller as well.
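
To illustrate the back-compat concern: a term built with one delimiter will not match a term built with another, so an existing index has to keep being read with the delimiter it was written with. The sketch below is not Lucene code. `toTerm` is a hypothetical helper that joins path components, and U+20AC is an assumed stand-in for the previous delimiter, which the issue text does not name.

```java
import java.nio.charset.StandardCharsets;

final class DelimCompatSketch {

    static final char NEW_DELIM = '\u001F';

    // Hypothetical helper: flattens category path components into a single term string.
    static String toTerm(char delim, String... components) {
        return String.join(String.valueOf(delim), components);
    }

    // Size of the flattened term once encoded as UTF-8.
    static int utf8Size(String term) {
        return term.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        char oldDelimStandIn = '\u20AC'; // assumption: illustrative three-byte delimiter

        // Term as it would sit in an index written before the change.
        String indexed = toTerm(oldDelimStandIn, "Author", "Mark Twain");

        // Lookups only match when they use the delimiter the index was built with.
        System.out.println(indexed.equals(toTerm(oldDelimStandIn, "Author", "Mark Twain"))); // true
        System.out.println(indexed.equals(toTerm(NEW_DELIM, "Author", "Mark Twain")));       // false

        // The new delimiter saves two bytes per separator.
        System.out.println(utf8Size(indexed));                                      // 19
        System.out.println(utf8Size(toTerm(NEW_DELIM, "Author", "Mark Twain")));    // 17
    }
}
```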

        mikemccand Michael McCandless added a comment -

        Simple patch, with note about the back compat break. I think it's ready!

        commit-tag-bot Commit Tag Bot added a comment -

        [trunk commit] Michael McCandless
        http://svn.apache.org/viewvc?view=revision&revision=1451578

        LUCENE-4806: change facet delim character to use 1 byte instead of 3 (in UTF-8)

        commit-tag-bot Commit Tag Bot added a comment -

        [branch_4x commit] Michael McCandless
        http://svn.apache.org/viewvc?view=revision&revision=1451579

        LUCENE-4806: change facet delim character to use 1 byte instead of 3 (in UTF-8)

        thetaphi Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee:
            mikemccand Michael McCandless
            Reporter:
            mikemccand Michael McCandless