  1. Solr
  2. SOLR-7193

Concatenate words from token stream


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Schema and Analysis
    • Labels: None

    Description

      User-entered data often lacks proper spacing between words, and spelling and formatting also vary for data such as business names and addresses. After tokenizing the data, we might perform pattern replacement, stop-word filtering, etc. We then want to concatenate all the tokens and generate n-gram tokens, so that business names can be indexed and fuzzy-matched.
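
      The pipeline described above (tokenize, clean up each token, drop stop words, concatenate, then emit n-grams) can be sketched in plain Java. This is only an illustration of the idea and not the code in the attached patch: the class name `ConcatNGramSketch`, the stop-word list, the cleanup regex, and the fixed gram size are all assumptions made for the example, and it does not use the Lucene/Solr analysis API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Solr or the attached patch.
class ConcatNGramSketch {

    // Assumed stop-word list, chosen only for this example.
    static final Set<String> STOP_WORDS = Set.of("the", "and", "of");

    /** Tokenize on whitespace, clean each token, drop stop words, then join with no separator. */
    static String concatenate(String input) {
        StringBuilder joined = new StringBuilder();
        for (String raw : input.trim().split("\\s+")) {
            // Pattern replacement step: lowercase and strip everything except ASCII letters and digits.
            String token = raw.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue; // stop-word filtering step
            }
            joined.append(token);
        }
        return joined.toString();
    }

    /** Fixed-size character n-grams over the concatenated text. */
    static List<String> nGrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Differently spaced spellings of the same business name collapse to one string.
        String a = concatenate("Mc Donald's");
        String b = concatenate("McDonalds");
        System.out.println(a.equals(b));
        System.out.println(nGrams(a, 3));
    }
}
```

      Because both spellings reduce to the same concatenated string, they produce identical n-grams, which is what makes spacing differences irrelevant at match time. In Solr itself the same effect would be configured as an analysis chain in the schema rather than written as application code.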

      Attachments

        1. concatenate_words.patch
          11 kB
          Abhishek Bafna

        Activity

          People

            Assignee: Unassigned
            Reporter: Abhishek Bafna (abhishekbafna)
            Votes: 0
            Watchers: 4

            Dates

              Created:
              Updated:
              Resolved: