  1. Solr
  2. SOLR-5276

Highlighter: use stemmed tokens from one field and text from another


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: highlighter
    • Labels: None

    Description

      The use case is as follows:
      I have a 'content' field holding text extracted with Tika, plus several language-specific fields (content_pl, content_en, content_es, content_ru, etc.).
      I also use a custom langid component that copies 'content' to several content_* fields when it detects more than one language, so that each part is properly stemmed for every language present in the text.

      Currently, to use the highlighter in this scenario, I have to store all of those language fields, even though their contents are always the same as the 'content' field.

      It would be nice if I could configure the language-specific fields as not stored, and configure the highlighter to take token positions from those fields and apply them to the text in the 'content' field.
      In other words: take tokens from 'content_pl' and build the highlight from the text in the 'content' field.
      It could be the administrator's responsibility to guarantee that the mapped fields have the same content.
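
To make the request concrete, here is a minimal sketch of the idea in Python. It is not Solr code and does not use any Solr API: the function name `highlight_with_offsets`, the sample text, and the offset list are all hypothetical. It assumes the matched tokens of the unstored field (e.g. content_pl) are available as (start, end) character offsets, and that the stored 'content' field holds exactly the same text, so the offsets can be applied to it directly.

```python
from typing import List, Tuple


def highlight_with_offsets(
    text: str,
    offsets: List[Tuple[int, int]],
    pre: str = "<em>",
    post: str = "</em>",
) -> str:
    """Wrap each (start, end) character span of `text` in pre/post markers.

    `offsets` are assumed to come from the analysis of another field
    (hypothetically content_pl) whose source text is identical to `text`.
    """
    out = []
    cursor = 0
    for start, end in sorted(offsets):
        # Skip empty, overlapping, or out-of-range spans instead of failing.
        if start >= end or start < cursor or end > len(text):
            continue
        out.append(text[cursor:start])
        out.append(pre + text[start:end] + post)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


# Hypothetical example: stored text from 'content', offsets from 'content_pl'.
content = "Koty lubią mleko"
content_pl_offsets = [(0, 4)]  # span of the token matched by the query
result = highlight_with_offsets(content, content_pl_offsets)
print(result)
```

The sketch only works if both fields really contain the same text, which is why the guarantee mentioned above matters: any difference between them shifts the offsets and the highlight lands on the wrong characters.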


            People

              Assignee: Unassigned
              Reporter: Maciej Lizewski (redguy666)
              Votes: 1
              Watchers: 4
