  Solr / SOLR-2978

Language variable expansion in field names for search results clustering


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: contrib - Clustering
    • Labels: None

    Description

      This is a follow-up of SOLR-2939.

      Another feature of the language recognizer that we should mirror is the expansion of the lang token in field names into the language code of the document (for a document with multiple languages, the first Carrot2-supported language code). This seems easy to implement in non-distributed Solr, but the simple implementation will not work in the distributed setting, because the name of the field to fetch depends on the content (language) of each matching document. Looking at the SearchClusteringEngine.getFieldsToLoad(SolrQueryRequest) method, a quick but costly solution would be to load the contents of all stored fields. I'm not very familiar with distributed-mode Solr, but maybe this could be optimized so that only the required fields are fetched?
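
      To make the proposal concrete, here is a minimal, standalone Java sketch of the expansion described above. It is not Solr or Carrot2 code: the class LangFieldExpander, the "{lang}" placeholder syntax, the fallback language, and the sample language codes are all assumptions made for illustration. It also shows why the set of fields to fetch is only known once each document's language is known.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Solr or Carrot2.
public class LangFieldExpander {

    // Assumed placeholder syntax for the lang token in a field name.
    static final String LANG_TOKEN = "{lang}";

    // Expands the token using the first language of the document that is in
    // the supported set; falls back to a default when none is supported.
    static String expand(String template, List<String> docLangs,
                         Set<String> supported, String fallback) {
        String lang = docLangs.stream()
                .filter(supported::contains)
                .findFirst()
                .orElse(fallback);
        return template.replace(LANG_TOKEN, lang);
    }

    // The concrete fields needed for a result set depend on the language of
    // every matching document, which is the difficulty in distributed mode.
    static Set<String> requiredFields(String template, List<List<String>> docs,
                                      Set<String> supported, String fallback) {
        Set<String> fields = new LinkedHashSet<>();
        for (List<String> docLangs : docs) {
            fields.add(expand(template, docLangs, supported, fallback));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Illustrative stand-in for the Carrot2-supported language codes.
        Set<String> supported = new LinkedHashSet<>(Arrays.asList("en", "de", "pl"));
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("en"),
                Arrays.asList("xx", "de", "en"), // first supported code is "de"
                Arrays.asList("yy"));            // nothing supported: fallback
        System.out.println(requiredFields("title_{lang}", docs, supported, "en"));
    }
}
```

      In the non-distributed case, expand could be applied per document once its language is available. In the distributed case, the result of requiredFields is not known when the field list is built, which is why loading all stored fields is the quick but costly option.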


          People

            Assignee: Stanislaw Osinski
            Reporter: Stanislaw Osinski