[SOLR-7136] Add an AutoPhrasing TokenFilter

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Adds an 'autophrasing' token filter that lets a multi-word noun phrase representing a single entity be emitted as a single token. The patch also adds ManagedResources support and query-parser auto-phrasing support, given LUCENE-2605.

      The rationale for this token filter, and its use in solving the long-standing multi-term synonym problem in Lucene/Solr, is documented in the following posts:

      http://lucidworks.com/blog/automatic-phrase-tokenization-improving-lucene-search-precision-by-more-precise-linguistic-analysis/

      https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/
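The idea described above can be illustrated with a minimal sketch. This is not the code from the attached patches: it is a standalone Python illustration, and the function name `auto_phrase`, the greedy longest-match strategy, and the underscore joiner are all assumptions made for the example. It shows a configured phrase list being used to collapse consecutive tokens into one token.

```python
def auto_phrase(tokens, phrases, joiner="_"):
    """Collapse known multi-word phrases in a token list into single tokens.

    Hypothetical illustration only; names and matching strategy are assumptions.
    """
    # Index phrases by their first word, longest phrase first,
    # so "new york city" is preferred over "new york".
    by_first = {}
    for phrase in phrases:
        words = tuple(phrase.lower().split())
        if words:
            by_first.setdefault(words[0], []).append(words)
    for candidates in by_first.values():
        candidates.sort(key=len, reverse=True)

    out = []
    i = 0
    while i < len(tokens):
        match = None
        for words in by_first.get(tokens[i].lower(), []):
            window = tuple(t.lower() for t in tokens[i:i + len(words)])
            if window == words:
                match = words
                break
        if match:
            # Emit the whole phrase as one token and skip past it.
            out.append(joiner.join(match))
            i += len(match)
        else:
            out.append(tokens[i])
            i += 1
    return out


phrases = ["income tax", "new york", "new york city"]
print(auto_phrase("visit new york city today".split(), phrases))
print(auto_phrase("income tax refund".split(), phrases))
```

Because a phrase such as "income tax" becomes one token, a single-token synonym rule can then be applied to it, which is the connection to the multi-term synonym problem mentioned above.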

        Attachments

        1. AutoPhaseFiniteStateDiagram.pdf
          176 kB
          Koorosh Vakhshoori
        2. SOLR-7136.patch
          142 kB
          Koorosh Vakhshoori
        3. SOLR-7136.patch
          84 kB
          Ted Sullivan
        4. SOLR-7136.patch
          81 kB
          Ted Sullivan
        5. SOLR-7136.patch
          43 kB
          Ted Sullivan

            People

            • Assignee:
              Unassigned
            • Reporter:
              Ted Sullivan (tedsullivan)
            • Votes:
              8
            • Watchers:
              16