Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-8812

ExtendedDismaxQParser (edismax) ignores Boolean OR when q.op=AND and mm is not explicitly set

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.5
    • Fix Version/s: 5.5.2, 5.6, 6.0.2, 6.1, 7.0
    • Component/s: query parsers
    • Labels:
      None

      Description

      The edismax parser ignores Boolean OR in queries when q.op=AND. This behavior is new to Solr 5.5.0 and an unexpected major change.

      Example:
      "q": "id:12345 OR zzzzzzzzzz",
      "defType": "edismax",
      "q.op": "AND",
      where "12345" is a known document ID and "zzzzzzzzzz" is a string NOT present in my data

      Version 5.5.0 produces zero results:
      "rawquerystring": "id:12345 OR zzzzzzzzzz",
      "querystring": "id:12345 OR zzzzzzzzzz",
      "parsedquery": "(+((id:12345 DisjunctionMaxQuery((text:zzzzzzzzzz)))~2))/no_coord",
      "parsedquery_toString": "+((id:12345 (text:zzzzzzzzzz))~2)",
      "explain": {},
      "QParser": "ExtendedDismaxQParser"

      Version 5.4.0 produces one result as expected
      "rawquerystring": "id:12345 OR zzzzzzzzzz",
      "querystring": "id:12345 OR zzzzzzzzzz",
      "parsedquery": "(+(id:12345 DisjunctionMaxQuery((text:zzzzzzzzzz))))/no_coord",
      "parsedquery_toString": "+(id:12345 (text:zzzzzzzzzz))"
      "explain": {},
      "QParser": "ExtendedDismaxQParser"

      1. SOLR-8812.patch
        10 kB
        Steve Rowe
      2. SOLR-8812.patch
        5 kB
        Greg Pendlebury
      3. SOLR-8812.patch
        1.0 kB
        Jan Høydahl
      4. SOLR-8812-barbie.patch
        3 kB
        Greg Pendlebury

        Issue Links

          Activity

          Hide
          jkrupan Jack Krupansky added a comment -

          The difference in the generated query appears to be the "))~2" which indicates a BooleanQuery with a minShouldMatch of 2 which means that both OR/SHOULD terms MUST match, effectively turning SHOULD/OR into MUST/AND.

          I'm guessing it was this 5.5 change: SOLR-2649:

          * SOLR-2649: MM ignored in edismax queries with operators.
            (Greg Pendlebury, Jan Høydahl et. al. via Erick Erickson)
          

          I think q.op=AND simply sets MM=100%, effectively overriding the explicit OR.

          Show
          jkrupan Jack Krupansky added a comment - The difference in the generated query appears to be the "))~2" which indicates a BooleanQuery with a minShouldMatch of 2 which means that both OR/SHOULD terms MUST match, effectively turning SHOULD/OR into MUST/AND. I'm guessing it was this 5.5 change: SOLR-2649 : * SOLR-2649: MM ignored in edismax queries with operators. (Greg Pendlebury, Jan Høydahl et. al. via Erick Erickson) I think q.op=AND simply sets MM=100%, effectively overriding the explicit OR.
          Hide
          ryanmax Ryan Steinberg added a comment -

          I agree: SOLR-2649 likely introduced this new behavior. I just read through the comments on SOLR-2649 and I'm still not sure this was intended: effectively, explicit OR is no longer possible when q.op=AND, even in the absence of an explicit mm param. After reading this helpful blog post referenced in the ExtendedDismaxQParser unit test, I now understand that AND takes precedence over OR but I'm not sure this is a clearly documented or anticipated consequence of this recent change.

          Show
          ryanmax Ryan Steinberg added a comment - I agree: SOLR-2649 likely introduced this new behavior. I just read through the comments on SOLR-2649 and I'm still not sure this was intended: effectively, explicit OR is no longer possible when q.op=AND, even in the absence of an explicit mm param. After reading this helpful blog post referenced in the ExtendedDismaxQParser unit test, I now understand that AND takes precedence over OR but I'm not sure this is a clearly documented or anticipated consequence of this recent change.
          Hide
          janhoy Jan Høydahl added a comment -

          Attaching patch with a failing test case. It is quite shocking that we apparently did not have test coverage of basic OR with q.op=AND already??

          Show
          janhoy Jan Høydahl added a comment - Attaching patch with a failing test case. It is quite shocking that we apparently did not have test coverage of basic OR with q.op=AND already??
          Hide
          janhoy Jan Høydahl added a comment -

          Promoting this bug as Blocker for now, as it directly affects correctness of many edismax queries, and likely to hit a large part of users upgrading to 5.5.0.

          Show
          janhoy Jan Høydahl added a comment - Promoting this bug as Blocker for now, as it directly affects correctness of many edismax queries, and likely to hit a large part of users upgrading to 5.5.0.
          Hide
          janhoy Jan Høydahl added a comment -

          Problem lies in SolrPluginUtils.setMinShouldMatch(BooleanQuery.Builder q, String spec, boolean mmAutoRelax), which before SOLR-2649 was only ever called if the query did not contain explicit OR or NOT, while now it is called always, but is not designed to handle those cases. The failing test has two clauses (text_sw:oil) (text_sw:stock) both of which are SHOULD. But still we return mm=2.

          Show
          janhoy Jan Høydahl added a comment - Problem lies in SolrPluginUtils.setMinShouldMatch(BooleanQuery.Builder q, String spec, boolean mmAutoRelax) , which before SOLR-2649 was only ever called if the query did not contain explicit OR or NOT , while now it is called always, but is not designed to handle those cases. The failing test has two clauses (text_sw:oil) (text_sw:stock) both of which are SHOULD . But still we return mm=2.
          Hide
          janhoy Jan Høydahl added a comment -

          Also, in SOLR-2649 we have

               public ExtendedSolrQueryParser(QParser parser, String defaultField) {
                 super(parser, defaultField);
          -      // don't trust that our parent class won't ever change its default
          -      setDefaultOperator(QueryParser.Operator.OR);
          +      // Respect the q.op parameter before mm will be applied later
          +      SolrParams defaultParams = SolrParams.wrapDefaults(parser.getLocalParams(), parser.getParams());
          +      QueryParser.Operator defaultOp = QueryParsing.getQueryParserDefaultOperator(
          +          parser.getReq().getSchema(), defaultParams.get(QueryParsing.OP));
          +      setDefaultOperator(defaultOp);
               }
          

          This means the mm calculations may now get REQUIRED clauses as input, while before SOLR-2649 when defaultOperator was forced to OR, all clauses would enter as OPTIONAL. Since minShouldMatch is supposed to be calculated only between optional clauses, this will fail.

          Show
          janhoy Jan Høydahl added a comment - Also, in SOLR-2649 we have public ExtendedSolrQueryParser(QParser parser, String defaultField) { super (parser, defaultField); - // don't trust that our parent class won't ever change its default - setDefaultOperator(QueryParser.Operator.OR); + // Respect the q.op parameter before mm will be applied later + SolrParams defaultParams = SolrParams.wrapDefaults(parser.getLocalParams(), parser.getParams()); + QueryParser.Operator defaultOp = QueryParsing.getQueryParserDefaultOperator( + parser.getReq().getSchema(), defaultParams.get(QueryParsing.OP)); + setDefaultOperator(defaultOp); } This means the mm calculations may now get REQUIRED clauses as input, while before SOLR-2649 when defaultOperator was forced to OR, all clauses would enter as OPTIONAL. Since minShouldMatch is supposed to be calculated only between optional clauses, this will fail.
          Hide
          rjosal Ryan Josal added a comment -

          On the topic of SOLR-2649, I just upgraded to 5.5 yesterday and SOLR-2649 broke one of our test cases which was "hair ties -barbie" should return hair ties but not barbie hair ties, and now it matches nothing. I assume this is intended, but if not, maybe this ticket also addresses it?

          Show
          rjosal Ryan Josal added a comment - On the topic of SOLR-2649 , I just upgraded to 5.5 yesterday and SOLR-2649 broke one of our test cases which was "hair ties -barbie" should return hair ties but not barbie hair ties, and now it matches nothing. I assume this is intended, but if not, maybe this ticket also addresses it?
          Hide
          janhoy Jan Høydahl added a comment -

          Erick Erickson Should we attempt a fix for 5.5.1 or do you prefer to revert SOLR-2649 until there is a better patch available?

          Show
          janhoy Jan Høydahl added a comment - Erick Erickson Should we attempt a fix for 5.5.1 or do you prefer to revert SOLR-2649 until there is a better patch available?
          Hide
          erickerickson Erick Erickson added a comment -

          Jan Høydahl I could go either way on this. As I stated in SOLR-2649, though, I'm out of my depth when dealing with the parsing here, plus I'm kind of slammed for the next while.

          So unless someone volunteers to create a patch, I'll revert SOLR-2649 before any 5.5.1 release.

          Show
          erickerickson Erick Erickson added a comment - Jan Høydahl I could go either way on this. As I stated in SOLR-2649 , though, I'm out of my depth when dealing with the parsing here, plus I'm kind of slammed for the next while. So unless someone volunteers to create a patch, I'll revert SOLR-2649 before any 5.5.1 release.
          Hide
          janhoy Jan Høydahl added a comment -

          Perhaps, after reverting, we could spend some effort adding better test coverage to edismax, to prevent similar regressions in the future.
          Then also add a bunch more tests for SOLR-2649, defining wanted behavior for all kind of corner cases.

          Show
          janhoy Jan Høydahl added a comment - Perhaps, after reverting, we could spend some effort adding better test coverage to edismax, to prevent similar regressions in the future. Then also add a bunch more tests for SOLR-2649 , defining wanted behavior for all kind of corner cases.
          Hide
          gpendleb Greg Pendlebury added a comment -

          I am happy to take a look at any issues, since I was involved in SOLR-2649. I need to get a new copy of the code first, but in the interim, can someone confirm that explicitly setting mm to 0 does not fix this? I believe mm defaults to 100%, so that may be the real culprit, as opposed to q.op=AND. Before SOLR-2649 was resolved, setting an OR operator would have caused mm to be ignored. Now it will use the default value unless you set it explicitly.

          Our production servers are using 5.1 with SOLR-2649 applied, and we have q.op=AND, with perfectly functional OR operators and mm=0%. All of the obvious queries work, including the cases referenced above.

          From memory there are a lot of subtle cliffs to fall off here, such as making sure we are talking about top level clauses and ultimately remembering that Solr does not use boolean logic... and there are some edge cases where it simply doesn't work the same way as the occurs flags. SHOULD vs OR is the main culprit.

          Show
          gpendleb Greg Pendlebury added a comment - I am happy to take a look at any issues, since I was involved in SOLR-2649 . I need to get a new copy of the code first, but in the interim, can someone confirm that explicitly setting mm to 0 does not fix this? I believe mm defaults to 100%, so that may be the real culprit, as opposed to q.op=AND. Before SOLR-2649 was resolved, setting an OR operator would have caused mm to be ignored. Now it will use the default value unless you set it explicitly. Our production servers are using 5.1 with SOLR-2649 applied, and we have q.op=AND, with perfectly functional OR operators and mm=0%. All of the obvious queries work, including the cases referenced above. From memory there are a lot of subtle cliffs to fall off here, such as making sure we are talking about top level clauses and ultimately remembering that Solr does not use boolean logic... and there are some edge cases where it simply doesn't work the same way as the occurs flags. SHOULD vs OR is the main culprit.
          Hide
          ryanmax Ryan Steinberg added a comment -

          I tested explicitly setting mm to 0 and all of my tests passed. I also added a mm=0 to the failing test case from Jan Høydahl and it passed too.

          Greg Pendlebury, I think your suspicion about mm defaulting to 100% is correct.

          Show
          ryanmax Ryan Steinberg added a comment - I tested explicitly setting mm to 0 and all of my tests passed. I also added a mm=0 to the failing test case from Jan Høydahl and it passed too. Greg Pendlebury , I think your suspicion about mm defaulting to 100% is correct.
          Hide
          gpendleb Greg Pendlebury added a comment -

          Thanks. Hopefully that is ok. I just installed git and started cloning trunk... now to upgrade to Java 8.

          I think it is all working as intended, it is just that there is a confusing legacy of not having to worry about what mm was set to for some use cases. SOLR-2649 will force people to check what the parameters are, but all queries are now supported.

          It would be nice if it was less disruptive, but given that pre-patch there was no way to get edismax to do certain queries, no matter what parameters you set, I think it is still an improvement.

          Show
          gpendleb Greg Pendlebury added a comment - Thanks. Hopefully that is ok. I just installed git and started cloning trunk... now to upgrade to Java 8. I think it is all working as intended, it is just that there is a confusing legacy of not having to worry about what mm was set to for some use cases. SOLR-2649 will force people to check what the parameters are, but all queries are now supported. It would be nice if it was less disruptive, but given that pre-patch there was no way to get edismax to do certain queries, no matter what parameters you set, I think it is still an improvement.
          Hide
          erickerickson Erick Erickson added a comment -

          Thanks to you both for jumping in here. I'll see what I can do to get this into any 5.5.1 and also 6.0 (and trunk of course).

          Jan Høydahl What do you think? We can raise adding tests as a separate JIRA...

          Show
          erickerickson Erick Erickson added a comment - Thanks to you both for jumping in here. I'll see what I can do to get this into any 5.5.1 and also 6.0 (and trunk of course). Jan Høydahl What do you think? We can raise adding tests as a separate JIRA...
          Hide
          janhoy Jan Høydahl added a comment -

          Setting q.op=AND or mm=100% should result in exactly the same behavior. The fact that setting mm=0 passes the test could be a coincidence more than a general workaround, perhaps an issue of the order things are done in the code. Would be interesting to see which code path is executed with q.op=AND&mm=0 though...

          Erick Erickson We can add more general tests in SOLR-8926.

          Show
          janhoy Jan Høydahl added a comment - Setting q.op=AND  or mm=100% should result in exactly the same behavior. The fact that setting mm=0 passes the test could be a coincidence more than a general workaround, perhaps an issue of the order things are done in the code. Would be interesting to see which code path is executed with q.op=AND&mm=0 though... Erick Erickson We can add more general tests in SOLR-8926 .
          Hide
          janhoy Jan Høydahl added a comment -

          I think your suspicion about mm defaulting to 100% is correct.

          Not entirely. The method DisMaxQParser#parseMinShouldMatch "Applies the appropriate default rules for the "mm" param based on the effective value of the "q.op" param", i.e. if mm is not explicitly set, it is set i 0 for q.op=OR and 100% for q.op=AND.

          Show
          janhoy Jan Høydahl added a comment - I think your suspicion about mm defaulting to 100% is correct. Not entirely. The method DisMaxQParser#parseMinShouldMatch "Applies the appropriate default rules for the "mm" param based on the effective value of the "q.op" param", i.e. if mm is not explicitly set, it is set i 0 for q.op=OR and 100% for q.op=AND.
          Hide
          erickerickson Erick Erickson added a comment -

          OK, just to see if we're all on the same page. I can make this not a blocker since we have a work-around (set mm=0 when mixing q.op=AND with OR top-level clauses) and work is ongoing to clarify the interaction between q.op and mm.

          The code Jan pointed out sure makes this confusing. Am I correct in that the OR is honored (without setting mm=0) in a query like q.op=AND&q=A and (B OR C)?

          Show
          erickerickson Erick Erickson added a comment - OK, just to see if we're all on the same page. I can make this not a blocker since we have a work-around (set mm=0 when mixing q.op=AND with OR top-level clauses) and work is ongoing to clarify the interaction between q.op and mm. The code Jan pointed out sure makes this confusing. Am I correct in that the OR is honored (without setting mm=0) in a query like q.op=AND&q=A and (B OR C)?
          Hide
          janhoy Jan Høydahl added a comment - - edited

          I can make this not a blocker since we have a work-around

          Isn't it premature to proclaim a generic workaround, at least without more test coverage to prove so?
          The majority of edismax users have not upgraded to 5.5.0 yet, and for many of them, existing applications and existing queries will start producing wrong results. If there is a generic workaround to add mm=0 in solrconfig.xml to get back the correct behavior, we can declare a workaround. But we cannot assume it is acceptable for people to write custom logic to conditionally change the request depending on how the user-entered query looks like.
          That's why I lean towards reverting, since a revert will only affect the very few 5.5.0 users who rely upon the new stuff. And they have a workaround in building a custom Solr version with the patch, or waiting until 5.5.x or 6.x where a proper fix is introduced.

          Show
          janhoy Jan Høydahl added a comment - - edited I can make this not a blocker since we have a work-around Isn't it premature to proclaim a generic workaround, at least without more test coverage to prove so? The majority of edismax users have not upgraded to 5.5.0 yet, and for many of them, existing applications and existing queries will start producing wrong results. If there is a generic workaround to add mm=0 in solrconfig.xml to get back the correct behavior, we can declare a workaround. But we cannot assume it is acceptable for people to write custom logic to conditionally change the request depending on how the user-entered query looks like. That's why I lean towards reverting, since a revert will only affect the very few 5.5.0 users who rely upon the new stuff. And they have a workaround in building a custom Solr version with the patch, or waiting until 5.5.x or 6.x where a proper fix is introduced.
          Hide
          gpendleb Greg Pendlebury added a comment -

          I don't know that what we are talking about here is a 'workaround' at all. Solr is doing exactly what it is being asked to do. I know it is disrupting an existing user base, so it warrants discussion and maybe even a 'fix'... but the existing user base were leaving a non-configured parameter at its default value (which probably didn't match their use case) and it only worked because the parameter was being ignored by edismax. The fact that parameter was ignored introduced the real bugs in SOLR-2649.

          I think there has always been confusion over how this works under the hood, and that still continues. q.op and mm apply to two different parts of the query, and each of them has other factors that come into play.

          • q.op is a boolean operator, which happens pre-parse (or in the very earliest stages of parsing)
          • mm applies to (top level) clauses which have the SHOULD occur flag after Solr translates all the boolean operators
          • if mm is not explicitly set, the default value is determined by q.op (? I haven't verified this, but that is Jan's input above). The old doco says it is always 100% default... but I personally have always set it explicitly... no experience.
          • Solr translates boolean operators into occurs flags differently depending on the value of q.op. In particular q.op=AND causes non-intuitive generation of occurs flags if looked at from a purely boolean perspective.
          • mm does not make much sense at all if you think about search as a purely boolean query (ie. the result either matches or doesn't) instead of occurs flags (ie. the score of the result is either higher or lower)

          So now that SOLR-2649 has come along, it slightly muddies the water because:

          • q.op is no longer hard coded to OR. Pre-patch the user could say q.op=AND, but it didn't do anything to the query
          • The presence of an operator no longer turns off the mm feature

          My take on the issue is that users who want to use boolean operators in edismax should pay attention to the mm parameter, and make sure their choice matches their use case. Previously they didn't have to... but the presence of the boolean operators when using edismax was buggy (? debatable... it has been argued that it simply wasn't the use case edismax was first written for).

          Having said that, IF anything was to change, I would simply play subtly with choosing the default value of mm. Maybe something like this:

          IF (the query contains a boolean operator) AND (mm has not been explicitly set) THEN (mm = 0%)

          It is a tweak on the work Jan did in SOLR-2649, so that instead of turning off mm in response to a boolean operator being present, we instead influence the default value. We still let users ultimately set up their parameters however they want though. If the user has a use case that includes both boolean parameters and mm logic... have fun.

          Show
          gpendleb Greg Pendlebury added a comment - I don't know that what we are talking about here is a 'workaround' at all. Solr is doing exactly what it is being asked to do. I know it is disrupting an existing user base, so it warrants discussion and maybe even a 'fix'... but the existing user base were leaving a non-configured parameter at its default value (which probably didn't match their use case) and it only worked because the parameter was being ignored by edismax. The fact that parameter was ignored introduced the real bugs in SOLR-2649 . I think there has always been confusion over how this works under the hood, and that still continues. q.op and mm apply to two different parts of the query, and each of them has other factors that come into play. q.op is a boolean operator, which happens pre-parse (or in the very earliest stages of parsing) mm applies to (top level) clauses which have the SHOULD occur flag after Solr translates all the boolean operators if mm is not explicitly set, the default value is determined by q.op (? I haven't verified this, but that is Jan's input above). The old doco says it is always 100% default... but I personally have always set it explicitly... no experience. Solr translates boolean operators into occurs flags differently depending on the value of q.op. In particular q.op=AND causes non-intuitive generation of occurs flags if looked at from a purely boolean perspective. mm does not make much sense at all if you think about search as a purely boolean query (ie. the result either matches or doesn't) instead of occurs flags (ie. the score of the result is either higher or lower) So now that SOLR-2649 has come along, it slightly muddies the water because: q.op is no longer hard coded to OR. Pre-patch the user could say q.op=AND, but it didn't do anything to the query The presence of an operator no longer turns off the mm feature My take on the issue is that users who want to use boolean operators in edismax should pay attention to the mm parameter, and make sure their choice matches their use case . Previously they didn't have to... but the presence of the boolean operators when using edismax was buggy (? debatable... it has been argued that it simply wasn't the use case edismax was first written for). Having said that, IF anything was to change, I would simply play subtly with choosing the default value of mm. Maybe something like this: IF (the query contains a boolean operator) AND (mm has not been explicitly set) THEN (mm = 0%) It is a tweak on the work Jan did in SOLR-2649 , so that instead of turning off mm in response to a boolean operator being present, we instead influence the default value. We still let users ultimately set up their parameters however they want though. If the user has a use case that includes both boolean parameters and mm logic... have fun.
          Hide
          janhoy Jan Høydahl added a comment -

          Greg Pendlebury looks like you have a good command on the logic and interactions here, so feel free to throw up a patch...

          IF (the query contains a boolean operator) AND (mm has not been explicitly set) THEN (mm = 0%)

          In my experience some users may set mm=100% while others achieved the same with q.op=AND. Advising users to favor the use of only q.op to get boolean operators working makes the distinction hard to explain.

          I'm positive to applying a fix that focuses on getting the pure AND/OR 0%/100% case fixed (if that could be achieved with your proposed method), and then worry about other mm's combined with explicit operators later.

          Show
          janhoy Jan Høydahl added a comment - Greg Pendlebury looks like you have a good command on the logic and interactions here, so feel free to throw up a patch... IF (the query contains a boolean operator) AND (mm has not been explicitly set) THEN (mm = 0%) In my experience some users may set mm=100% while others achieved the same with q.op=AND. Advising users to favor the use of only q.op to get boolean operators working makes the distinction hard to explain. I'm positive to applying a fix that focuses on getting the pure AND/OR 0%/100% case fixed (if that could be achieved with your proposed method), and then worry about other mm's combined with explicit operators later.
          Hide
          gpendleb Greg Pendlebury added a comment -

          Ok, I will try to find some time over the next week or so. I freely confess it doesn't look great on a Friday afternoon and school holidays begin here after next week. It might be a rough contribution someone else can carry over the line.

          With regards to mixed cases of q.op and mm where users are explicitly setting them, I think they are already covered if you look in the unit test testDefaultOperatorWithMm(). The problem here seems to be the use case where people do not explicitly set mm and fall back to the default. This is treading on some expected behaviour from existing users.

          Show
          gpendleb Greg Pendlebury added a comment - Ok, I will try to find some time over the next week or so. I freely confess it doesn't look great on a Friday afternoon and school holidays begin here after next week. It might be a rough contribution someone else can carry over the line. With regards to mixed cases of q.op and mm where users are explicitly setting them, I think they are already covered if you look in the unit test testDefaultOperatorWithMm(). The problem here seems to be the use case where people do not explicitly set mm and fall back to the default. This is treading on some expected behaviour from existing users.
          Hide
          gpendleb Greg Pendlebury added a comment -

          Attaching possible 'fix' that defaults mm to 0% if the users has declared no explicit mm, but has boolean operators in their query.

          First time I have generated a patch using git, so hopefully it is ok.

          Show
          gpendleb Greg Pendlebury added a comment - Attaching possible 'fix' that defaults mm to 0% if the users has declared no explicit mm, but has boolean operators in their query. First time I have generated a patch using git, so hopefully it is ok.
          Hide
          gpendleb Greg Pendlebury added a comment -

          I also confirmed (for my own sanity) that q.op does indeed influence the default value of mm, as per Jan Høydahl. Personally I don't like that, and perhaps it isn't relevant anymore since SOLR-2649... but I left it alone.

          Show
          gpendleb Greg Pendlebury added a comment - I also confirmed (for my own sanity) that q.op does indeed influence the default value of mm, as per Jan Høydahl . Personally I don't like that, and perhaps it isn't relevant anymore since SOLR-2649 ... but I left it alone.
          Hide
          janhoy Jan Høydahl added a comment -

          Should we add a test case for Ryan Josal's issue above as well?

          "hair ties -barbie" should return hair ties but not barbie hair ties, and now it matches nothing.

          Show
          janhoy Jan Høydahl added a comment - Should we add a test case for Ryan Josal 's issue above as well? "hair ties -barbie" should return hair ties but not barbie hair ties, and now it matches nothing.
          Hide
          erickerickson Erick Erickson added a comment -

          We've got something of a time problem here. 6.0 is being released (there was an RC1, but it's being re-spun) and this JIRA is marked as a blocker.

          I don't think that this should be a blocker for 6.0 since it's already in 5.5. And I'm reluctant to hurry a fix in to 6.0 at this late date given the discussion so far. I suspect that practically 6.0 won't be rushed into production before we get a fix for this issue (6.0.1 or 6.1 or whatever).

          We should fix it ASAP, but not at the expense of holding up 6.0.

          Thoughts?

          Show
          erickerickson Erick Erickson added a comment - We've got something of a time problem here. 6.0 is being released (there was an RC1, but it's being re-spun) and this JIRA is marked as a blocker. I don't think that this should be a blocker for 6.0 since it's already in 5.5. And I'm reluctant to hurry a fix in to 6.0 at this late date given the discussion so far. I suspect that practically 6.0 won't be rushed into production before we get a fix for this issue (6.0.1 or 6.1 or whatever). We should fix it ASAP, but not at the expense of holding up 6.0. Thoughts?
          Hide
          ichattopadhyaya Ishan Chattopadhyaya added a comment -

          How about initiating a 5.5.1 and 6.0.1 release immediately after the 6.0 release?

          Show
          ichattopadhyaya Ishan Chattopadhyaya added a comment - How about initiating a 5.5.1 and 6.0.1 release immediately after the 6.0 release?
          Hide
          gpendleb Greg Pendlebury added a comment -

          Adding a 'hair ties -barbie' example to unit tests. Not sure it demonstrates anything new, but it does work as I would expect.

          I can't get git to generate a combined patch the way I would have in svn... my git-fu is weak.

          Show
          gpendleb Greg Pendlebury added a comment - Adding a 'hair ties -barbie' example to unit tests. Not sure it demonstrates anything new, but it does work as I would expect. I can't get git to generate a combined patch the way I would have in svn... my git-fu is weak.
          Hide
          gpendleb Greg Pendlebury added a comment -

          Erick Erickson, personally, I am ambivalent with regards to timing and versions. I am still not convinced there is actually an issue here, but I don't want to be a dick and dismiss it out-of-hand.

          The patches provided are simply about choosing default parameter values that disrupt the least number of users who did not have mm set to an appropriate value. Any user (risky, broad generalisation incoming) who puts a boolean OR operator into an edismax query string would not want mm=100%, but that is what is happening here.

          Show
          gpendleb Greg Pendlebury added a comment - Erick Erickson , personally, I am ambivalent with regards to timing and versions. I am still not convinced there is actually an issue here, but I don't want to be a dick and dismiss it out-of-hand. The patches provided are simply about choosing default parameter values that disrupt the least number of users who did not have mm set to an appropriate value. Any user (risky, broad generalisation incoming) who puts a boolean OR operator into an edismax query string would not want mm=100%, but that is what is happening here.
          Hide
          erickerickson Erick Erickson added a comment -

          Since this hasn't really reached any sort of resolution, and since this behavior is out there in the field already I changed its status. I'll keep it on my back burner.

          I suspect that Jan's comments about beefing up the test cases is well taken. When I Get Some Time (tm) I'll see about beefing up the test cases (all volunteers welcome). I think it'll be easier to discuss specific test cases and whether the behavior they're testing is the desired behavior...

          Show
          erickerickson Erick Erickson added a comment - Since this hasn't really reached any sort of resolution, and since this behavior is out there in the field already I changed its status. I'll keep it on my back burner. I suspect that Jan's comments about beefing up the test cases is well taken. When I Get Some Time (tm) I'll see about beefing up the test cases (all volunteers welcome). I think it'll be easier to discuss specific test cases and whether the behavior they're testing is the desired behavior...
          Hide
          janhoy Jan Høydahl added a comment -

          Relevant user@ email thread, confirming the impression that this indeed must be treated as a bug: http://search-lucene.com/m/eHNlru3hR1b74Oz&subj=How+can+I+set+the+defaultOperator+to+be+AND+

          Show
          janhoy Jan Høydahl added a comment - Relevant user@ email thread, confirming the impression that this indeed must be treated as a bug: http://search-lucene.com/m/eHNlru3hR1b74Oz&subj=How+can+I+set+the+defaultOperator+to+be+AND+
          Hide
          stephlag Stephan Lagraulet added a comment -

          Can you remove this issue from version 5.5.1 as this version is packaged without a fix for this issue?

          Show
          stephlag Stephan Lagraulet added a comment - Can you remove this issue from version 5.5.1 as this version is packaged without a fix for this issue?
          Hide
          steve_rowe Steve Rowe added a comment -

          Patch combining both of Greg Pendlebury's patches, beefing up the tests, simplifying the change a bit, and putting things back where they were prior to SOLR-2649.

          Some of the comment text was skeptical about what's currently supported, so I changed it to more neutral language. I also tried to regularize the failure text in the tests.

          I also added one test to demonstrate that David Smiley's SOLR-9174 mm issue is (still) handled properly.

          All Solr tests and procommit passes. Committing shortly.

          Show
          steve_rowe Steve Rowe added a comment - Patch combining both of Greg Pendlebury 's patches, beefing up the tests, simplifying the change a bit, and putting things back where they were prior to SOLR-2649 . Some of the comment text was skeptical about what's currently supported, so I changed it to more neutral language. I also tried to regularize the failure text in the tests. I also added one test to demonstrate that David Smiley 's SOLR-9174 mm issue is (still) handled properly. All Solr tests and procommit passes. Committing shortly.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 5bc34949adc911dacba29c08f1e522de4679aee6 in lucene-solr's branch refs/heads/branch_5_5 from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5bc3494 ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit 5bc34949adc911dacba29c08f1e522de4679aee6 in lucene-solr's branch refs/heads/branch_5_5 from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=5bc3494 ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 508f8f83e1ed3bb8c7f8ca37d767e95b5b7b87c3 in lucene-solr's branch refs/heads/branch_5_5 from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=508f8f8 ]

          SOLR-8812: CHANGES.txt entry

          Show
          jira-bot ASF subversion and git services added a comment - Commit 508f8f83e1ed3bb8c7f8ca37d767e95b5b7b87c3 in lucene-solr's branch refs/heads/branch_5_5 from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=508f8f8 ] SOLR-8812 : CHANGES.txt entry
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit e88a64a0490ac2139e0aa55a8ca8b3588ce41395 in lucene-solr's branch refs/heads/branch_5x from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e88a64a ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit e88a64a0490ac2139e0aa55a8ca8b3588ce41395 in lucene-solr's branch refs/heads/branch_5x from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e88a64a ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1dff75c643f17c5447264fecb9b71530e05f017f in lucene-solr's branch refs/heads/branch_5x from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1dff75c ]

          SOLR-8812: CHANGES.txt entry

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1dff75c643f17c5447264fecb9b71530e05f017f in lucene-solr's branch refs/heads/branch_5x from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1dff75c ] SOLR-8812 : CHANGES.txt entry
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 202b993335a370be06db02b866c62a876350b01d in lucene-solr's branch refs/heads/branch_6_0 from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=202b993 ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit 202b993335a370be06db02b866c62a876350b01d in lucene-solr's branch refs/heads/branch_6_0 from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=202b993 ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit d5712fea6058ed5fe905f4e157466d117a1f38c0 in lucene-solr's branch refs/heads/branch_6_0 from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d5712fe ]

          SOLR-8812: CHANGES.txt entry

          Show
          jira-bot ASF subversion and git services added a comment - Commit d5712fea6058ed5fe905f4e157466d117a1f38c0 in lucene-solr's branch refs/heads/branch_6_0 from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d5712fe ] SOLR-8812 : CHANGES.txt entry
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit c2aac7d005424094e129045e834aa6ffb2c7aa82 in lucene-solr's branch refs/heads/branch_6_1 from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c2aac7d ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit c2aac7d005424094e129045e834aa6ffb2c7aa82 in lucene-solr's branch refs/heads/branch_6_1 from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c2aac7d ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit e0c6762de8dd91e02a79a6d12b3164a052bf07ff in lucene-solr's branch refs/heads/branch_6_1 from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e0c6762 ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit e0c6762de8dd91e02a79a6d12b3164a052bf07ff in lucene-solr's branch refs/heads/branch_6_1 from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e0c6762 ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 3c789e9d274fb670c53c8eb6eb9dfceace2cd120 in lucene-solr's branch refs/heads/branch_6x from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3c789e9 ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit 3c789e9d274fb670c53c8eb6eb9dfceace2cd120 in lucene-solr's branch refs/heads/branch_6x from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3c789e9 ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 74d94ec26585d57d6991f99c892f7be1278346f9 in lucene-solr's branch refs/heads/branch_6x from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=74d94ec ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit 74d94ec26585d57d6991f99c892f7be1278346f9 in lucene-solr's branch refs/heads/branch_6x from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=74d94ec ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 38714399760889d2d7b188a87341aade6139ffef in lucene-solr's branch refs/heads/master from Steve Rowe
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3871439 ]

          SOLR-8812: edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649

          Show
          jira-bot ASF subversion and git services added a comment - Commit 38714399760889d2d7b188a87341aade6139ffef in lucene-solr's branch refs/heads/master from Steve Rowe [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3871439 ] SOLR-8812 : edismax: turn off mm processing if no explicit mm spec is provided and there are explicit operators (except for AND) - addresses problems caused by SOLR-2649
          Hide
          gpendleb Greg Pendlebury added a comment -

          Sounds (tentatively) ok to me. I was quite concerned when you said it puts things back to pre-SOLR-2649 functionality, but from looking at what got committed it seems that q.op=OR is no longer hardcoded in setDefaultOperator() (which was fixed in SOLR-2649). I haven't executed anything, but this seems like a good step with regards to mm handling.

          Show
          gpendleb Greg Pendlebury added a comment - Sounds (tentatively) ok to me. I was quite concerned when you said it puts things back to pre- SOLR-2649 functionality, but from looking at what got committed it seems that q.op=OR is no longer hardcoded in setDefaultOperator() (which was fixed in SOLR-2649 ). I haven't executed anything, but this seems like a good step with regards to mm handling.
          Hide
          dsmiley David Smiley added a comment -

          Thanks so much for getting this done Steve Rowe (and others)!

          Show
          dsmiley David Smiley added a comment - Thanks so much for getting this done Steve Rowe (and others)!
          Hide
          erickerickson Erick Erickson added a comment -

          Big thanks Steve! This has been contributing to my "guilt load" for a really long time.

          Show
          erickerickson Erick Erickson added a comment - Big thanks Steve! This has been contributing to my "guilt load" for a really long time.
          Hide
          steve_rowe Steve Rowe added a comment -

          Sounds (tentatively) ok to me. I was quite concerned when you said it puts things back to pre-SOLR-2649 functionality, but from looking at what got committed it seems that q.op=OR is no longer hardcoded in setDefaultOperator() (which was fixed in SOLR-2649). I haven't executed anything, but this seems like a good step with regards to mm handling.

          Greg, what I committed is behaviorally the same as your patch - check the unit tests: I included all of the ones you wrote, but added more to cover all permutations of operator and q.op.

          What I meant by "put things back" was that I attempted to minimize the diff between what I committed and the pre-SOLR-2649 code, so I put your foundOperators() back in the same location that it was when it was named doMinMatched() - your patch put it in a different location in the source file.

          Thanks for the work you did on this issue!

          Show
          steve_rowe Steve Rowe added a comment - Sounds (tentatively) ok to me. I was quite concerned when you said it puts things back to pre- SOLR-2649 functionality, but from looking at what got committed it seems that q.op=OR is no longer hardcoded in setDefaultOperator() (which was fixed in SOLR-2649 ). I haven't executed anything, but this seems like a good step with regards to mm handling. Greg, what I committed is behaviorally the same as your patch - check the unit tests: I included all of the ones you wrote, but added more to cover all permutations of operator and q.op . What I meant by "put things back" was that I attempted to minimize the diff between what I committed and the pre- SOLR-2649 code, so I put your foundOperators() back in the same location that it was when it was named doMinMatched() - your patch put it in a different location in the source file. Thanks for the work you did on this issue!
          Hide
          gpendleb Greg Pendlebury added a comment -

          Sounds great. Add my thanks to the those you've already received.

          Show
          gpendleb Greg Pendlebury added a comment - Sounds great. Add my thanks to the those you've already received.
          Hide
          janhoy Jan Høydahl added a comment -

          Thanks Steve for pushing this through!

          Show
          janhoy Jan Høydahl added a comment - Thanks Steve for pushing this through!

            People

            • Assignee:
              steve_rowe Steve Rowe
              Reporter:
              ryanmax Ryan Steinberg
            • Votes:
              8 Vote for this issue
              Watchers:
              20 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development