Lucene - Core / LUCENE-8202

Add a FixedShingleFilter

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.4
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      In LUCENE-3475 I tried to make a ShingleGraphFilter that could accept and emit arbitrary graphs while duplicating all the functionality of the existing ShingleFilter.  This turned out to be extremely hairy, and it doesn't play well with query parsers.

      I'd like to step back and try to create a simpler shingle filter that is used for index-time phrase tokenization only.  It will have a single fixed shingle size, will handle single-token synonyms, and won't emit unigrams.
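
      The behaviour described above can be illustrated with a small standalone sketch. It is not the code from the attached patches and does not use the Lucene TokenStream API. The class name FixedShingleSketch, the method shingles, and the modelling of synonyms as several terms stacked at one position are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, no Lucene dependency.
public class FixedShingleSketch {

    // positions.get(i) holds every term at token position i; a single-token
    // synonym is modelled as an extra term stacked at the same position.
    static List<String> shingles(List<List<String>> positions, int size, String separator) {
        if (size < 2) {
            // Assumption of this sketch: a size of 1 would just be unigrams,
            // which the filter is not supposed to emit.
            throw new IllegalArgumentException("shingle size must be at least 2");
        }
        List<String> out = new ArrayList<>();
        // Slide a window of exactly `size` positions over the input.
        for (int start = 0; start + size <= positions.size(); start++) {
            List<String> partial = new ArrayList<>();
            partial.add("");
            // Extend every partial shingle with each alternative at the next position.
            for (int offset = 0; offset < size; offset++) {
                List<String> next = new ArrayList<>();
                for (String prefix : partial) {
                    for (String term : positions.get(start + offset)) {
                        next.add(offset == 0 ? term : prefix + separator + term);
                    }
                }
                partial = next;
            }
            out.addAll(partial);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> positions = List.of(
            List.of("please"),
            List.of("divide", "split"), // "split" stands in for a synonym of "divide"
            List.of("this"),
            List.of("sentence"));
        for (String shingle : shingles(positions, 2, " ")) {
            System.out.println(shingle);
        }
    }
}
```

      With a shingle size of 2, the sample input yields one shingle per combination of alternatives in each two-position window, and an input shorter than the shingle size yields nothing at all, since no unigrams are emitted.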

        Attachments

        1. LUCENE-8202.patch
          23 kB
          Alan Woodward
        2. LUCENE-8202.patch
          24 kB
          Alan Woodward
        3. LUCENE-8202.patch
          26 kB
          Alan Woodward
        4. LUCENE-8202-fixes.patch
          7 kB
          Alan Woodward

            People

            • Assignee: Alan Woodward (romseygeek)
            • Reporter: Alan Woodward (romseygeek)
            • Votes: 0
            • Watchers: 6