Lucene - Core / LUCENE-6914

DecimalDigitFilter skips characters in some cases (supplemental?)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.4
    • Fix Version/s: 5.5.4, 6.3, 7.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Found this while writing up the Solr ref guide for DecimalDigitFilter.

      With input like "𝟙𝟡𝟠𝟜" ("Double Struck" 1984) the filter produces "1𝟡8𝟜" (1, double struck 9, 8, double struck 4). Add some non-decimal characters in between the digits (i.e. "𝟙x𝟡x𝟠x𝟜") and you get the expected output ("1x9x8x4"). This doesn't affect all decimal characters though, as evidenced by the existing test cases.

      Perhaps this is an off-by-one bug in the "if the original was supplementary, shrink the string" code path?
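
      To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It is not the Lucene source: the class DigitFoldSketch and its methods delete and fold are hypothetical names invented for illustration, and the loop is only an assumption about what an in-place "replace the digit, then shrink the buffer" pass might look like. It shows how advancing the index once too often after removing a low surrogate would reproduce the output reported above.

```java
public class DigitFoldSketch {

    // Hypothetical helper: removes the char at pos by shifting the tail left.
    // Returns the new logical length of the buffer.
    static int delete(char[] buf, int pos, int len) {
        if (pos < len - 1) {
            System.arraycopy(buf, pos + 1, buf, pos, len - pos - 1);
        }
        return len - 1;
    }

    // Folds non-ASCII decimal digits to '0'..'9' in place.
    // buggy=true models the suspected off-by-one; buggy=false models one possible fix.
    static String fold(String input, boolean buggy) {
        char[] buf = input.toCharArray();
        int len = buf.length;
        for (int i = 0; i < len; i++) {
            int cp = Character.codePointAt(buf, i, len);
            if (cp > 0x7F && Character.isDigit(cp)) {
                // Overwrite the first char of the digit with its ASCII form.
                buf[i] = (char) ('0' + Character.getNumericValue(cp));
                if (cp > 0xFFFF) {
                    // Supplementary digit: its low surrogate must be removed.
                    if (buggy) {
                        // i is bumped here AND by the loop's i++, so the char
                        // that slides into the freed slot is never examined.
                        len = delete(buf, ++i, len);
                    } else {
                        // Leave i alone; the loop's i++ lands on the next char.
                        len = delete(buf, i + 1, len);
                    }
                }
            }
        }
        return new String(buf, 0, len);
    }

    public static void main(String[] args) {
        // Double-struck 1, 9, 8, 4 (U+1D7D9, U+1D7E1, U+1D7E0, U+1D7DC) as surrogate pairs.
        String d1 = "\uD835\uDFD9", d9 = "\uD835\uDFE1", d8 = "\uD835\uDFE0", d4 = "\uD835\uDFDC";

        String adjacent = d1 + d9 + d8 + d4;
        String separated = d1 + "x" + d9 + "x" + d8 + "x" + d4;

        System.out.println(fold(adjacent, true));   // every other digit is left unconverted
        System.out.println(fold(separated, true));  // 1x9x8x4 (only the 'x' chars are skipped)
        System.out.println(fold(adjacent, false));  // 1984
    }
}
```

      In this sketch the separator case works only by accident: the character skipped after each supplementary digit is the "x", which needs no conversion. With adjacent supplementary digits, the skipped position holds the next digit's high surrogate, so the loop resumes on a lone low surrogate and that digit is left untouched.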

        Attachments

        1. LUCENE-6914.patch
          8 kB
          Hoss Man
        2. LUCENE-6914.patch
          4 kB
          Hoss Man
        3. LUCENE-6914.patch
          0.9 kB
          Hoss Man


              People

              • Assignee:
                Unassigned
              • Reporter:
                Hoss Man (hossman)
              • Votes:
                0
              • Watchers:
                4
