Lucene - Core
LUCENE-1903

Incorrect ShingleFilter behavior when outputUnigrams == false

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.9
    • Fix Version/s: 2.9
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      ShingleFilter does not work as expected when outputUnigrams == false: it still outputs unigrams at least some of the time.

      I'll attach a patch to ShingleFilterTest.java that adds some test cases that demonstrate the problem.

      I haven't checked this, but I suspect that the behavior for outputUnigrams == false changed when the class was upgraded to the new TokenStream API.
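
      For reference, the behavior expected when outputUnigrams == false can be sketched in plain Java. This is a minimal stand-alone model, not Lucene's ShingleFilter code: the class ShingleSketch and its shingles method are hypothetical names introduced here, tokens are assumed to be joined with a single space, and position increments and offsets are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of the expected output; NOT Lucene's ShingleFilter.
class ShingleSketch {

    // Returns the terms a shingle filter is expected to emit for the given tokens.
    // maxShingleSize is the largest number of words per shingle (assumed >= 2).
    // When outputUnigrams is false, no single-word term may appear in the result.
    static List<String> shingles(List<String> tokens, int maxShingleSize, boolean outputUnigrams) {
        if (maxShingleSize < 2) {
            throw new IllegalArgumentException("maxShingleSize must be >= 2");
        }
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (outputUnigrams) {
                out.add(tokens.get(start));
            }
            // Grow the shingle one word at a time, emitting sizes 2..maxShingleSize.
            StringBuilder shingle = new StringBuilder(tokens.get(start));
            for (int size = 2; size <= maxShingleSize && start + size <= tokens.size(); size++) {
                shingle.append(' ').append(tokens.get(start + size - 1));
                out.add(shingle.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("please", "divide", "this", "sentence");
        // With unigrams: single words interleaved with the bigrams.
        System.out.println(shingles(input, 2, true));
        // Without unigrams: bigrams only, which is what this report expects.
        System.out.println(shingles(input, 2, false));
    }
}
```

      In this model, the input "please divide this sentence" with a maximum shingle size of 2 and outputUnigrams == false yields only "please divide", "divide this", and "this sentence". Any single-word term in the real filter's output under that setting is the bug described above.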

      1. TEST-org.apache.lucene.analysis.shingle.ShingleFilterTest.xml
        15 kB
        Chris Harris
      2. LUCENE-1903_testcases_lucene2_4_1_version.patch
        5 kB
        Chris Harris
      3. LUCENE-1903_testcases.patch
        5 kB
        Chris Harris
      4. LUCENE-1903.patch
        7 kB
        Uwe Schindler
      5. LUCENE-1903.patch
        8 kB
        Chris Harris

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Chris Harris
            • Votes:
              0
              Watchers:
              1
