Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.5, 6.0
    • Component/s: modules/spellchecker
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This is like StopFilter, except if the token is the very last token
      and there were no non-token characters after it, it keeps the token.

      This is useful with analyzing suggesters (AnalyzingSuggester,
      AnalyzingInfixSuggester, FuzzySuggester), where you often want to
      remove stop words, but not if it's the last word and the user hasn't
      finished typing it.

      E.g. "fast a" might complete to "fast amoeba", but if you simply use
      StopFilter then the "a" is removed.

      Really, our analysis APIs aren't quite designed to handle the "partial"
      tokens that suggesters need to work with.
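
      The rule described above can be illustrated with a small, self-contained
      sketch. This is not the code from the patch and it does not use Lucene's
      TokenStream API; the class name SuggestStopSketch, the filter method and
      the letter/digit tokenizer are all hypothetical stand-ins. It assumes
      that "no non-token characters after it" can be approximated by checking
      whether the token's end offset equals the length of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified stand-in for the behaviour described in this issue. This is NOT
// Lucene's SuggestStopFilter: it is plain Java with a hypothetical tokenizer
// that treats runs of letters and digits as tokens.
class SuggestStopSketch {

    // Drops stop words, except a stop word that is the last token and ends
    // exactly at the end of the input (the user may still be typing it).
    static List<String> filter(String input, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        int finalOffset = input.length();
        int i = 0;
        while (i < finalOffset) {
            // Skip non-token characters.
            while (i < finalOffset && !Character.isLetterOrDigit(input.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume one token.
            while (i < finalOffset && Character.isLetterOrDigit(input.charAt(i))) {
                i++;
            }
            if (start == i) {
                break; // only trailing non-token characters were left
            }
            int endOffset = i;
            String token = input.substring(start, endOffset).toLowerCase();
            boolean isStopWord = stopWords.contains(token);
            // A token whose end offset equals the final offset has nothing after it.
            boolean mayBePartial = endOffset == finalOffset;
            if (!isStopWord || mayBePartial) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopWords = Set.of("a", "if", "the");
        System.out.println(filter("fast a", stopWords));       // trailing "a" is kept
        System.out.println(filter("fast a ", stopWords));      // "a" is followed by a space, so it is dropped
        System.out.println(filter("fail if byte", stopWords)); // "if" is not the last token, so it is dropped
    }
}
```

      In this sketch a trailing space is enough to mark the last word as
      finished, so "fast a " loses its "a" while "fast a" keeps it.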

      1. LUCENE-5165.patch
        31 kB
        Michael McCandless

          Activity

          Michael McCandless added a comment -

          Patch, I think it's ready... it [sneakily] calls end() from its
          incrementToken and then looks at the final endOffset to decide whether
          to filter the stopword or not.

          I've pushed it to http://jirasearch.mikemccandless.com and now "fail
          if byte" gets the right suggestion (before, it got no suggestions
          because I was keeping stop words at lookup time to work around the
          issue).

          Robert Muir added a comment -

          This looks good, I like the BaseTokenStreamTestCase improvements especially.

          ASF subversion and git services added a comment -

          Commit 1513940 from Michael McCandless in branch 'dev/trunk'
          [ https://svn.apache.org/r1513940 ]

          LUCENE-5165: add SuggestStopFilter

          ASF subversion and git services added a comment -

          Commit 1513942 from Michael McCandless in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1513942 ]

          LUCENE-5165: add SuggestStopFilter

          Adrien Grand added a comment -

          4.5 release -> bulk close


            People

            • Assignee:
              Michael McCandless
              Reporter:
              Michael McCandless
            • Votes:
              0
              Watchers:
              3
