Solr / SOLR-405

Search additional fields when using DisMaxRequestHandler

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: search
    • Labels: None

      Description

      We are heavily leaning towards using a few DisMaxRequestHandlers for searching instead of copy fields, but we ran into an issue. Currently our sites use something like a copy field to search stories, but they also need to search additional fields (like story_source, which we don't want in the dismax). With the DisMaxRequestHandler as it is, anything you have in the q param is searched for in the fields defined in the DisMaxRequestHandler. We need a little more flexibility with this.

      As an example, if you search for something like "bush+AND+story_source:associated", all the fields in the dismax are searched for 'bush' and 'story_source:associated'. (The story_source field is not in the dismax handler, and we don't want it to be.) What we want to do is search the fields defined in the dismax for 'bush', but also query the story_source field (and only the story_source field) for 'associated'.

      We came up with this small patch to let us do what we need, but wanted to throw it out there in case others were interested, or know of a better way to do this. We're not entirely sure we did this in the right place and are hoping that maybe someone can provide some insight on that as well.

          Activity

          Doug Steigerwald added a comment -

          Small patch that modified the DisjunctionMaxQueryParser to allow the search of additional fields.

          Anything in the q param that has a specific field to search is just searched on that field, otherwise the DisMax query is created the same way.

          Also making sure we can allow people to search for things like "mailto:doug@...".

          Yonik Seeley added a comment -

          The specific example you give would best be accomplished by a filter.
          fq=story_source:associated
          The only issue is if you wanted relevancy scores for these other parts included in the main score.
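As a rough illustration of the filter approach, a request could be assembled along these lines. The endpoint URL and the `qf` fields (`headline`, `body`) are placeholders that do not come from this issue; only the term `bush` and the filter `story_source:associated` are taken from the description. The sketch also assumes a Solr version where `defType=dismax` selects the parser; with a dedicated DisMaxRequestHandler you would select that handler instead.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust to your own Solr host and core.
SOLR_SELECT = "http://localhost:8983/solr/select"


def dismax_with_filter(user_query, filters):
    """Build a dismax request URL where field restrictions go in fq, not q."""
    params = [
        ("defType", "dismax"),
        ("q", user_query),          # free text only, searched across qf
        ("qf", "headline body"),    # hypothetical dismax fields
    ]
    # Each filter restricts the result set without contributing to the score.
    params.extend(("fq", f) for f in filters)
    return SOLR_SELECT + "?" + urlencode(params)


url = dismax_with_filter("bush", ["story_source:associated"])
print(url)
```

Because `fq` only narrows the result set, the `story_source` match does not influence ranking, which is the limitation noted above.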

          Hoss Man added a comment -

          ditto yonik's comment.

          this patch also violates the spirit of dismax, which is that the user providing the "q" param shouldn't be expected to know what the field names of the index are ... not to mention there are probably a lot of fields you don't want the user to be able to "accidentally" query against specifically.

          If people like the idea of allowing stuff like this, a cleaner way would be to allow some dismax options for specifying a list of mappings between the "field aliases" you want to advertise to your users and the real field names you want them to correspond to (so you can tell users they can search for "author:Bob" but behind the scenes you search for author_text:Bob), then:
          1) have the dismax handler register these "aliases" on the dismax parser (it already supports aliases; the current behavior comes from the fact that the default field is aliased to the list of things in the "qf")
          2) modify the partialEscape function to know about the list of aliases and not escape any colon that appears after the name of a configured field alias.
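Step 2 could look roughly like the following Python sketch. This is not Solr's actual partialEscape code: the alias table, the set of escaped characters, and the function name are all assumptions made for illustration. Resolving an alias to its real field (step 1) is left to the parser and is not done here.

```python
import re

# Hypothetical alias table: user-facing name -> real index field.
# The escaper only needs the alias names; the parser would do the mapping.
FIELD_ALIASES = {"author": "author_text"}

# Characters this sketch treats as special and escapes. '+', '-' and '"'
# are deliberately left alone so users can still require, exclude, and
# phrase-quote terms.
SPECIAL = set('\\!():^[]{}~*?')


def partial_escape(query, aliases=FIELD_ALIASES):
    """Escape special characters, except the colon after a known alias."""
    out = []
    i = 0
    while i < len(query):
        # An alias only counts at the start of a token.
        prev = query[i - 1] if i > 0 else " "
        if prev.isspace() or prev in "+-":
            m = re.match(r"(\w+):", query[i:])
            if m and m.group(1) in aliases:
                out.append(m.group(0))   # keep "alias:" unescaped
                i += m.end()
                continue
        ch = query[i]
        out.append("\\" + ch if ch in SPECIAL else ch)
        i += 1
    return "".join(out)


print(partial_escape("bush +author:Bob story_source:associated"))
```

In this sketch `author:` survives as a field qualifier because it is a configured alias, while the colons in `story_source:associated` and `mailto:doug@...` are escaped and treated as plain text.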

          Otis Gospodnetic added a comment -

          Yonik is right, but I think it would still be nice to allow field:XXX + DisMax combinations and have the field:XXX influence relevancy score.

          Hoss's example seems to assume the searching is always done by people (hence the possible need to hide some fields). But many times Solr is searched "programmatically" by other apps (while humans sleep), so there is no need to hide.

          Now that query parsers are pluggable, I wonder if we could have a custom QP that allows this... Doug?

          Jan Høydahl added a comment -

          You can use edismax for this

          SPRING_CLEANING_2013
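A minimal sketch of what the edismax request might look like for the example in the description. The `qf` fields (`headline`, `body`) are hypothetical; `uf` is the edismax parameter that limits which fields a user may query explicitly, and restricting it to `story_source` here is an illustrative choice, not something stated in this issue.

```python
from urllib.parse import urlencode


def edismax_params(user_query):
    """Parameters for an edismax query that accepts field:value in q."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": "headline body",   # hypothetical fields for unqualified terms
        "uf": "story_source",    # fields users may reference explicitly
    }


qs = urlencode(edismax_params("bush AND story_source:associated"))
print(qs)
```

With this setup the `story_source:associated` clause is part of the main query, so it contributes to the relevancy score rather than acting only as a filter.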


            People

            • Assignee: Unassigned
            • Reporter: Doug Steigerwald
            • Votes: 0
            • Watchers: 1