Solr
  1. Solr
  2. SOLR-3534

dismax and edismax should default to "df" when "qf" is absent.

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.0-ALPHA
    • Component/s: query parsers
    • Labels:
      None

      Description

      The dismax and edismax query parsers should default to "df" when the "qf" parameter is absent. They only use the defaultSearchField in schema.xml as a fallback now.

        Issue Links

          Activity

          David Smiley made changes -
          Status Open [ 1 ] Closed [ 6 ]
          Fix Version/s 4.0 [ 12314992 ]
          Resolution Fixed [ 1 ]
          Hide
          David Smiley added a comment -

          Committed to 4x & trunk.

          Show
          David Smiley added a comment - Committed to 4x & trunk.
          Hide
          Jack Krupansky added a comment -

          I think it would be helpful to users and ease the support burden if we added a warning to the comment for <defaultSchemaField> (whether deprecated or not) that says "WARNING: The "df" request parameter, including the default in any query request handlers, overide this default. So, be sure to review how "df" may be used for relevant request handlers before specifying a last-resort default search field here."

          Similar wording should be included in SOLR-2724.

          Show
          Jack Krupansky added a comment - I think it would be helpful to users and ease the support burden if we added a warning to the comment for <defaultSchemaField> (whether deprecated or not) that says "WARNING: The "df" request parameter, including the default in any query request handlers, overide this default. So, be sure to review how "df" may be used for relevant request handlers before specifying a last-resort default search field here." Similar wording should be included in SOLR-2724 .
          Hide
          David Smiley added a comment -

          Whoops – that commit (#1350466) was mis-commented to SOLR-3304.

          Show
          David Smiley added a comment - Whoops – that commit (#1350466) was mis-commented to SOLR-3304 .
          David Smiley made changes -
          Hide
          David Smiley added a comment -

          This is an updated patch. Instead of SolrPluginUtils, I chose QueryParsing which already has a similar method for q.op. And like q.op I had the 2nd arg be the string that the caller resolves. Some callers don't have a convenient params to provide. The fact that some don't led me to start doing more refactorings to QParser that I decided to withdraw from as to not make this issue do too much at once.

          I already committed test modifications so that this patch will pass. (I jumped the gun perhaps but no matter.) You should see this change in the subversion tab in JIRA.

          Show
          David Smiley added a comment - This is an updated patch. Instead of SolrPluginUtils, I chose QueryParsing which already has a similar method for q.op. And like q.op I had the 2nd arg be the string that the caller resolves. Some callers don't have a convenient params to provide. The fact that some don't led me to start doing more refactorings to QParser that I decided to withdraw from as to not make this issue do too much at once. I already committed test modifications so that this patch will pass. (I jumped the gun perhaps but no matter.) You should see this change in the subversion tab in JIRA.
          Hide
          Hoss Man added a comment -

          The point of the test is to assert that DismaxQParser can function correctly with nothing but a "q" param and a schema specifying a defaultSearchField. If that's covered by another test you're adding (or that already exists) then great, we don't need it.

          Show
          Hoss Man added a comment - The point of the test is to assert that DismaxQParser can function correctly with nothing but a "q" param and a schema specifying a defaultSearchField. If that's covered by another test you're adding (or that already exists) then great, we don't need it.
          Hide
          David Smiley added a comment -

          TestExtendedDismaxParser line 126 already tests that defaultSearchField is consulted. In this patch I added another test above it to ensure that "df" is consulted.

          Show
          David Smiley added a comment - TestExtendedDismaxParser line 126 already tests that defaultSearchField is consulted. In this patch I added another test above it to ensure that "df" is consulted.
          Hide
          Hoss Man added a comment -

          I'll presume that you don't mean "<defaultSearchField/>" literally, you mean "<defaultSearchField>text</defaultSearchField/>".

          yes, sorry .. i was using "<defaultSearchField/>" as shorthand for <defaultSearchField>...something...</defaultSearchField/>, that was bad on my part and totally confusing.

          So are you effectively saying that schema-minimal.xml should add a defaultSearchField to it?

          No, i'm saying that as long as "<defaultSearchField>...</defaultSearchField/>" is legal and supported configuration, then this specific test (of "dismaxNoDefaults") should use a schema that has a "<defaultSearchField>...</defaultSearchField/>" in it since that's the point of the test.

          schema-minimal.xml should certainly not have a "<defaultSearchField>..." added, since that would no longer truely be a minimal schema.xml

          Show
          Hoss Man added a comment - I'll presume that you don't mean "<defaultSearchField/>" literally, you mean "<defaultSearchField>text</defaultSearchField/>". yes, sorry .. i was using "<defaultSearchField/>" as shorthand for <defaultSearchField>...something...</defaultSearchField/>, that was bad on my part and totally confusing. So are you effectively saying that schema-minimal.xml should add a defaultSearchField to it? No, i'm saying that as long as "<defaultSearchField>...</defaultSearchField/>" is legal and supported configuration, then this specific test (of "dismaxNoDefaults") should use a schema that has a "<defaultSearchField>...</defaultSearchField/>" in it since that's the point of the test. schema-minimal.xml should certainly not have a "<defaultSearchField>..." added, since that would no longer truely be a minimal schema.xml
          Hide
          David Smiley added a comment -

          Hoss, I like your suggestion of refactoring this to SolrPluginUtils (not *Tools which doesn't exist). And also I realized that SolrParams.get() takes a 2nd arg for the default which can be s.getDefaultSearchFieldName(), simplifying this even more.

          As Bernd noted, that test was written at a time when the schema.xml used by the test had a <defaultSearchField/> declared – that was/is the entire point of the test: that the Dismax(Handler|QParser) could work with a "<defaultSearchField/>" and a "q" and no other params specified. As long as "<defaultSearchField/>" is legal (even if it's deprecated and not mentioned in the example schema.xml) a test like that should exist somewhere shouldn't it? (if/when "<defaultSearchField/>" is no longer legal, then certainly change the test to add a "df" param and assert that it fails if one isn't specified)

          I'm confused by this, especially since you "+1"'ed on throwing an exception. I'll presume that you don't mean "<defaultSearchField/>" literally, you mean "<defaultSearchField>text</defaultSearchField/>". So are you effectively saying that schema-minimal.xml should add a defaultSearchField to it?

          Show
          David Smiley added a comment - Hoss, I like your suggestion of refactoring this to SolrPluginUtils (not *Tools which doesn't exist). And also I realized that SolrParams.get() takes a 2nd arg for the default which can be s.getDefaultSearchFieldName(), simplifying this even more. As Bernd noted, that test was written at a time when the schema.xml used by the test had a <defaultSearchField/> declared – that was/is the entire point of the test: that the Dismax(Handler|QParser) could work with a "<defaultSearchField/>" and a "q" and no other params specified. As long as "<defaultSearchField/>" is legal (even if it's deprecated and not mentioned in the example schema.xml) a test like that should exist somewhere shouldn't it? (if/when "<defaultSearchField/>" is no longer legal, then certainly change the test to add a "df" param and assert that it fails if one isn't specified) I'm confused by this, especially since you "+1"'ed on throwing an exception. I'll presume that you don't mean "<defaultSearchField/>" literally, you mean "<defaultSearchField>text</defaultSearchField/>". So are you effectively saying that schema-minimal.xml should add a defaultSearchField to it?
          Hide
          Hoss Man added a comment -

          dismax&edismax should look at 'df' before falling back to defaultSearchField

          +1 ... i thought it already did that, but i guess not. If we are "deprecating/discouraging" <defaultSearchField/> and instructing people to use "df" instead, then we should absolutely make 100% certain any code path we ship that currently consults <defaultSearchField/> checks "df" first. (if/when the code paths that consult <defaultSearchField/> are removed, they should still consult "df")

          dismax&edismax should throw an exception if neither 'qf', 'df', nor defaultSearchField are specified, because these two query parsers are fairly useless without them.

          +1 .. (although i suppose edismax could still be usable if every clause is fully qualified with a fieldname/alias and fail only when a clause that requires a default is encountered ... just like the LuceneQParser)

          I ran all tests before committing and found the MinimalSchemaTest failed related to the "dismaxNoDefaults" request handler in the test solrconfig.xml which was added in SOLR-1776. The problem is throwing an exception if there's no 'qf', 'df', or default search field. I disagree with that test – it is erroneous/misleading to use dismax without setting specifying via any of those 3 mechanisms. I am inclined to delete the "dismaxNoDefaults" test request handler (assuming there are no other ramifications). I want to get input from Hoss who put it there so I'll wait.

          As Bernd noted, that test was written at a time when the schema.xml used by the test had a <defaultSearchField/> declared – that was/is the entire point of the test: that the Dismax(Handler|QParser) could work with a "<defaultSearchField/>" and a "q" and no other params specified. As long as "<defaultSearchField/>" is legal (even if it's deprecated and not mentioned in the example schema.xml) a test like that should exist somewhere shouldn't it? (if/when "<defaultSearchField/>" is no longer legal, then certainly change the test to add a "df" param and assert that it fails if one isn't specified)

          The current patch looks like a great start to me ... but i would suggest refactoring this core little bit into it's own method in SolrPluginTools and replacing every use of getDefaultSearchFieldName in the code base with it (and add a link to it from getDefaultSearchFieldName javadocs encouraging people to use it instead)...

          /**
           * returns the effective default field based on the params or 
           * hardcoded schema default.  may be null if either exists specified.
           * @see CommonParams#DF
           * @see IndexSchema#getDefaultSearchFieldName
           */
          public static String getDefaultField(final IndexSchema s, final SolrParams p) {
            String df = p.get(CommonParams.DF);
            if (df == null) {
              df = s.getDefaultSearchFieldName();
            }
            return df;
          }
          
          Show
          Hoss Man added a comment - dismax&edismax should look at 'df' before falling back to defaultSearchField +1 ... i thought it already did that, but i guess not. If we are "deprecating/discouraging" <defaultSearchField/> and instructing people to use "df" instead, then we should absolutely make 100% certain any code path we ship that currently consults <defaultSearchField/> checks "df" first. (if/when the code paths that consult <defaultSearchField/> are removed, they should still consult "df") dismax&edismax should throw an exception if neither 'qf', 'df', nor defaultSearchField are specified, because these two query parsers are fairly useless without them. +1 .. (although i suppose edismax could still be usable if every clause is fully qualified with a fieldname/alias and fail only when a clause that requires a default is encountered ... just like the LuceneQParser) I ran all tests before committing and found the MinimalSchemaTest failed related to the "dismaxNoDefaults" request handler in the test solrconfig.xml which was added in SOLR-1776 . The problem is throwing an exception if there's no 'qf', 'df', or default search field. I disagree with that test – it is erroneous/misleading to use dismax without setting specifying via any of those 3 mechanisms. I am inclined to delete the "dismaxNoDefaults" test request handler (assuming there are no other ramifications). I want to get input from Hoss who put it there so I'll wait. As Bernd noted, that test was written at a time when the schema.xml used by the test had a <defaultSearchField/> declared – that was/is the entire point of the test: that the Dismax(Handler|QParser) could work with a "<defaultSearchField/>" and a "q" and no other params specified. As long as "<defaultSearchField/>" is legal (even if it's deprecated and not mentioned in the example schema.xml) a test like that should exist somewhere shouldn't it? (if/when "<defaultSearchField/>" is no longer legal, then certainly change the test to add a "df" param and assert that it fails if one isn't specified) – The current patch looks like a great start to me ... but i would suggest refactoring this core little bit into it's own method in SolrPluginTools and replacing every use of getDefaultSearchFieldName in the code base with it (and add a link to it from getDefaultSearchFieldName javadocs encouraging people to use it instead)... /** * returns the effective default field based on the params or * hardcoded schema default . may be null if either exists specified. * @see CommonParams#DF * @see IndexSchema#getDefaultSearchFieldName */ public static String getDefaultField( final IndexSchema s, final SolrParams p) { String df = p.get(CommonParams.DF); if (df == null ) { df = s.getDefaultSearchFieldName(); } return df; }
          Hide
          David Smiley added a comment -

          Just to keep these concerns separated, this issue, SOLR-3534 is about two things:

          • dismax&edismax should look at 'df' before falling back to defaultSearchField
          • dismax&edismax should throw an exception if neither 'qf', 'df', nor defaultSearchField are specified, because these two query parsers are fairly useless without them.

          SOLR-2724 is about the deprecation of defaultSearchField

          Show
          David Smiley added a comment - Just to keep these concerns separated, this issue, SOLR-3534 is about two things: dismax&edismax should look at 'df' before falling back to defaultSearchField dismax&edismax should throw an exception if neither 'qf', 'df', nor defaultSearchField are specified, because these two query parsers are fairly useless without them. SOLR-2724 is about the deprecation of defaultSearchField
          Hide
          David Smiley added a comment -

          defaultSearchField may be referenced in a bunch of places but it is always a default for something else that you should be specifying (typically 'df'). I've commented out my defaultSearchField long before it was deprecated.

          Show
          David Smiley added a comment - defaultSearchField may be referenced in a bunch of places but it is always a default for something else that you should be specifying (typically 'df'). I've commented out my defaultSearchField long before it was deprecated.
          Hide
          Bernd Fehling added a comment -

          Because of defaultSearchField that's why I stated in SOLR-2724 that 3.6.x (and obviously 4.x also) is now broken.
          There are many places where getDefaultSearchFieldName() is called and noone looked through the code after setting defaultSearchField to deprecated in schema.xml.

          Show
          Bernd Fehling added a comment - Because of defaultSearchField that's why I stated in SOLR-2724 that 3.6.x (and obviously 4.x also) is now broken. There are many places where getDefaultSearchFieldName() is called and noone looked through the code after setting defaultSearchField to deprecated in schema.xml.
          Hide
          David Smiley added a comment - - edited

          I ran all tests before committing and found the MinimalSchemaTest failed related to the "dismaxNoDefaults" request handler in the test solrconfig.xml which was added in SOLR-1776. The problem is throwing an exception if there's no 'qf', 'df', or default search field. I disagree with that test – it is erroneous/misleading to use dismax without setting specifying via any of those 3 mechanisms. I am inclined to delete the "dismaxNoDefaults" test request handler (assuming there are no other ramifications). I want to get input from Hoss who put it there so I'll wait.

          Show
          David Smiley added a comment - - edited I ran all tests before committing and found the MinimalSchemaTest failed related to the "dismaxNoDefaults" request handler in the test solrconfig.xml which was added in SOLR-1776 . The problem is throwing an exception if there's no 'qf', 'df', or default search field. I disagree with that test – it is erroneous/misleading to use dismax without setting specifying via any of those 3 mechanisms. I am inclined to delete the "dismaxNoDefaults" test request handler (assuming there are no other ramifications). I want to get input from Hoss who put it there so I'll wait.
          David Smiley made changes -
          Link This issue relates to SOLR-1776 [ SOLR-1776 ]
          Hide
          David Smiley added a comment -

          Jack: I'll improve the exception wording as you suggest.

          Any idea what happens for the classic/Solr or flex query parsers if default search field is not present?

          The lucene query parser (which is the default) doesn't technically require a default field, but if there is ambiguity in the query (i.e. a simple search word) then you get an exception.

          Show
          David Smiley added a comment - Jack: I'll improve the exception wording as you suggest. Any idea what happens for the classic/Solr or flex query parsers if default search field is not present? The lucene query parser (which is the default) doesn't technically require a default field, but if there is ambiguity in the query (i.e. a simple search word) then you get an exception.
          Hide
          Jack Krupansky added a comment -

          Looks like a reasonable compromise. I would edit the exception so that it reads like you just said: "neither 'qf', 'df' nor the default search field are present" or at least add "Neither" in front of what you currently have.

          Any idea what happens for the classic/Solr or flex query parsers if default search field is not present?

          Show
          Jack Krupansky added a comment - Looks like a reasonable compromise. I would edit the exception so that it reads like you just said: "neither 'qf', 'df' nor the default search field are present" or at least add "Neither" in front of what you currently have. Any idea what happens for the classic/Solr or flex query parsers if default search field is not present?
          Hide
          David Smiley added a comment -

          Bernd:
          In my patch I did throw an exception if neither 'qf', 'df' nor the default search field were present.

          It's tempting to log warnings if a default is relied upon that is inadvisable (like defaultSearchField), but that could flood logs. A one-time flag could be set to prevent this I guess.

          Show
          David Smiley added a comment - Bernd: In my patch I did throw an exception if neither 'qf', 'df' nor the default search field were present. It's tempting to log warnings if a default is relied upon that is inadvisable (like defaultSearchField), but that could flood logs. A one-time flag could be set to prevent this I guess.
          David Smiley made changes -
          Field Original Value New Value
          Attachment SOLR-3534_dismax_and_edismax_should_default_to_df_if_qf_is_absent.patch [ 12531832 ]
          Hide
          David Smiley added a comment -

          Attached is a patch, with a test. I pulled out this logic into a static method so that both dismax and edismax could use it, just as it was done for parsing MM. I'll apply this patch tomorrow.

          Show
          David Smiley added a comment - Attached is a patch, with a test. I pulled out this logic into a static method so that both dismax and edismax could use it, just as it was done for parsing MM. I'll apply this patch tomorrow.
          Hide
          Bernd Fehling added a comment -

          I think that there shouldn't be any hardwired config, but the fallback should report/log its action so that the user has the chance to find his config problem and correct it.
          And instead of silently providing zero search results I would recommend an exception.

          Show
          Bernd Fehling added a comment - I think that there shouldn't be any hardwired config, but the fallback should report/log its action so that the user has the chance to find his config problem and correct it. And instead of silently providing zero search results I would recommend an exception.
          Hide
          Erik Hatcher added a comment -

          Isn't "text" the traditional default search field? Including in the Solr example schema.

          Yes, it's in the example schema.... but we should not hardcode field names anywhere. The configuration should determine any "default" field names, and thus why "text" is wired into the example schema and solrconfig files liberally.

          Show
          Erik Hatcher added a comment - Isn't "text" the traditional default search field? Including in the Solr example schema. Yes, it's in the example schema.... but we should not hardcode field names anywhere. The configuration should determine any "default" field names, and thus why "text" is wired into the example schema and solrconfig files liberally.
          Hide
          Jack Krupansky added a comment -

          Okay, but if df does not have a default value, then at least have it throw an exception at an early stage, like core/handler load, rather than result in queries with an empty field name that returns zero results as it does today.

          Although, I suppose it is possible that a particular application may intend to set df explicitly on each request and not need/want a default. But, even in that case, I think an exception should be thrown rather than accept an empty field name and silently provide zero search results.

          I am still in favor of "text" as the hardwired default. Isn't "text" the traditional default search field? Including in the Solr example schema.

          But, that said, I will certainly defer to the rest of the Solr community.

          Show
          Jack Krupansky added a comment - Okay, but if df does not have a default value, then at least have it throw an exception at an early stage, like core/handler load, rather than result in queries with an empty field name that returns zero results as it does today. Although, I suppose it is possible that a particular application may intend to set df explicitly on each request and not need/want a default. But, even in that case, I think an exception should be thrown rather than accept an empty field name and silently provide zero search results. I am still in favor of "text" as the hardwired default. Isn't "text" the traditional default search field? Including in the Solr example schema. But, that said, I will certainly defer to the rest of the Solr community.
          Hide
          David Smiley added a comment -

          RE "text" default – that would be yet another default and worse, IMO, is it would be too hidden of a default. Being explicit by specifying a parameter on the request is best, IMO.

          Show
          David Smiley added a comment - RE "text" default – that would be yet another default and worse, IMO, is it would be too hidden of a default. Being explicit by specifying a parameter on the request is best, IMO.
          Hide
          Jack Krupansky added a comment -

          I would also suggest that the default if neither qf or df is present should be "text" preferably as a symbolic constant.

          Show
          Jack Krupansky added a comment - I would also suggest that the default if neither qf or df is present should be "text" preferably as a symbolic constant.
          David Smiley created issue -

            People

            • Assignee:
              David Smiley
              Reporter:
              David Smiley
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development