Solr / SOLR-7136

Add an AutoPhrasing TokenFilter

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Adds an 'autophrasing' token filter that lets a multi-word noun phrase representing a single entity be emitted as a single token. Also adds ManagedResources support and, given LUCENE-2605, auto-phrasing support in the query parser.
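To make the idea concrete, here is a minimal sketch of auto-phrasing over a plain list of tokens. It is not the code from the attached patches and does not use the Lucene TokenFilter API; the class name `AutoPhraseSketch`, the `autoPhrase` method, the greedy longest-match strategy, and the `_` joiner are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not the SOLR-7136 implementation.
public class AutoPhraseSketch {

    // Greedy longest-match: at each position, try the longest known phrase first
    // and, if one matches, emit it as a single joined token.
    static List<String> autoPhrase(List<String> tokens, Set<List<String>> phrases, String joiner) {
        int maxLen = 1;
        for (List<String> p : phrases) {
            maxLen = Math.max(maxLen, p.size());
        }

        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            // Only multi-word phrases (length >= 2) are candidates for joining.
            for (int len = Math.min(maxLen, tokens.size() - i); len >= 2; len--) {
                if (phrases.contains(tokens.subList(i, i + len))) {
                    matched = len;
                    break;
                }
            }
            if (matched > 0) {
                out.add(String.join(joiner, tokens.subList(i, i + matched)));
                i += matched;
            } else {
                out.add(tokens.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed phrase list; a real deployment would load this from configuration.
        Set<List<String>> phrases = Set.of(
            List.of("seven", "seas"),
            List.of("new", "york", "city"));
        List<String> tokens = List.of(
            "sailing", "the", "seven", "seas", "from", "new", "york", "city");
        // prints [sailing, the, seven_seas, from, new_york_city]
        System.out.println(autoPhrase(tokens, phrases, "_"));
    }
}
```

Because each phrase becomes one token, a downstream synonym mapping can treat it as a single term, which is the connection to the multi-term synonym problem that this issue describes.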

      The rationale for this token filter, and its use in solving the long-standing multi-term synonym problem in Lucene/Solr, is documented in these blog posts:

      http://lucidworks.com/blog/automatic-phrase-tokenization-improving-lucene-search-precision-by-more-precise-linguistic-analysis/

      https://lucidworks.com/blog/solution-for-multi-term-synonyms-in-lucenesolr-using-the-auto-phrasing-tokenfilter/

      Attachments

        1. AutoPhaseFiniteStateDiagram.pdf
          176 kB
          Koorosh Vakhshoori
        2. SOLR-7136.patch
          142 kB
          Koorosh Vakhshoori
        3. SOLR-7136.patch
          84 kB
          Ted Sullivan
        4. SOLR-7136.patch
          81 kB
          Ted Sullivan
        5. SOLR-7136.patch
          43 kB
          Ted Sullivan

          People

            Assignee: Unassigned
            Reporter: Ted Sullivan (tedsullivan)
            Votes: 6
            Watchers: 14
