Solr
  1. Solr
  2. SOLR-6780

some param values are duplicated when they override defaults, or are combined with appends values, or are an invariant that overrides a request param

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.1, 5.0, 6.0
    • Fix Version/s: 4.10.4, 5.0, 6.0
    • Component/s: None
    • Labels:

      Description

      The DefaultSolrParams class, which is used as the basis for the implementation of defaults, appends and invariants params had a bug in it's implementation of getParameterNamesIterator() that could result in the same param key being returned multiple times.

      In many code paths of Solr, this bug had no effects – but in other cases, it resulted in code which iterated over the list of all parameters to take action multiple times for the (valid) key=value pairs.

      There were 4 main areas where this bug had unexpected & problematic behavior for end users:

      main problem areas & impacts
      • ExtractingRequestHandler
        • "literal.*" params will be duplicated if overridden by defaults/invariants/appends - this will result in redundent literal field=value params being added to the document.
        • impact: multiple values in literal fields when not expected/desired
      • FacetComponent
        • "facet.*" params will be duplicated if overridden by defaults/invariants/appends - this can result in redundent computation and identical facet.field, facet.query, or facet.range blocks in the response
        • impact: wasted computation & increased response size
      • SpellCheckComponent
        • when "custom params" (ie: "spellcheck.[dictionary name].XXXX=YYYY" are used in used in defaults, appends, or invariants, it can cause redudent XXXXX=YYYY params to be used.
        • when "spellcheck.collateParam.XXXX=YYYY" type params are used defaults, appends, or invariants, it can cause redundent XXXX=YYYY params to exist in the collation verification queries.
        • impact: unclear to me at first glance, probably just wasted computation & increased response size
      • AnalyticsComponent
        • "olap.*" params will be duplicated if overridden by defaults/invariants/appends - this can result in redundent computation
        • impact: unclear to me at first glance, probably just wasted computation & increased response size

      Other less serious impacts were redundent values in "echoParams" as well as some small amounts of wasted computation in other code paths that iterated over the set of params (due to a slightly larger set of param values)

      Original bug report: "Merging request parameters with defaults produce duplicate entries"

      When a parameter (e.g. echoParams) is specified and overrides the default on the handler, it actually generates two entries for that key with the same value.

      Most of the time it is just a confusion and not an issue, however, some components will do the work twice. For example faceting component as described in http://search-lucene.com/m/QTPaSlFUQ1/duplicate

      It may also be connected to SOLR-6369

      The cause seems to be the interplay between DefaultSolrParams#getParameterNamesIterator() which just returns param names in sequence and SolrParams#toNamedList() which uses the first (override then default) value for each key, without deduplication.

      It's easily reproducible in trunk against schemaless example with

      curl "http://localhost:8983/solr/schemaless/select?indent=true&echoParams=all"

      I've also spot checked it and it seems to be reproducible back to Solr 4.1.

        Issue Links

          Activity

          Hide
          Hoss Man added a comment -

          What an evil freaking bug.

          I audited all of the usages of getParameterNamesIterator() to try and track down how severe the impacts of this issue may be – most of them are either semi-benign (like the "echoParams" case) or not affected by the redundency (ie: code that iterats over all params looking for certain things, and then adds the assocaited names/values to a Set so they get deduped anyway)

          There were 4 main areas i found where this bug could result in problematic behavior

          • ExtractingRequestHandler
            • "literal.*" params will be duplicated if overridden by defaults/invariants/appends - this will result in redundent literal field=value params being added to the document.
            • impact: multiple values in literal fields when not expected/desired
          • FacetComponent
            • "facet.*" params will be duplicated if overridden by defaults/invariants/appends - this can result in redundent computation and identical facet.field, facet.query, or facet.range blocks in the response
            • impact: wasted computation & increased response size
          • SpellCheckComponent
            • when "custom params" (ie: "spellcheck.[dictionary name].XXXX=YYYY" are used in used in defaults, appends, or invariants, it can cause redudent XXXXX=YYYY params to be used.
            • when "spellcheck.collateParam.XXXX=YYYY" type params are used defaults, appends, or invariants, it can cause redundent XXXX=YYYY params to exist in the collation verification queries.
            • impact: unclear to me at first glance, probably just wasted computation & increased response size
          • AnalyticsComponent
            • "olap.*" params will be duplicated if overridden by defaults/invariants/appends - this can result in redundent computation
            • impact: unclear to me at first glance, probably just wasted computation & increased response size

          ...in addition to fixing the bug & adding explicit unit tests for it, the attached patch also includes some sanity check testing for the FacetComponent & ExtractingRequestHandler situations above, to try and protect us from similar "redundency" bugs in the future.

          Show
          Hoss Man added a comment - What an evil freaking bug. I audited all of the usages of getParameterNamesIterator() to try and track down how severe the impacts of this issue may be – most of them are either semi-benign (like the "echoParams" case) or not affected by the redundency (ie: code that iterats over all params looking for certain things, and then adds the assocaited names/values to a Set so they get deduped anyway) There were 4 main areas i found where this bug could result in problematic behavior ExtractingRequestHandler "literal.*" params will be duplicated if overridden by defaults/invariants/appends - this will result in redundent literal field=value params being added to the document. impact: multiple values in literal fields when not expected/desired FacetComponent "facet.*" params will be duplicated if overridden by defaults/invariants/appends - this can result in redundent computation and identical facet.field, facet.query, or facet.range blocks in the response impact: wasted computation & increased response size SpellCheckComponent when "custom params" (ie: "spellcheck.[dictionary name].XXXX=YYYY" are used in used in defaults, appends, or invariants, it can cause redudent XXXXX=YYYY params to be used. when "spellcheck.collateParam.XXXX=YYYY" type params are used defaults, appends, or invariants, it can cause redundent XXXX=YYYY params to exist in the collation verification queries. impact: unclear to me at first glance, probably just wasted computation & increased response size AnalyticsComponent "olap.*" params will be duplicated if overridden by defaults/invariants/appends - this can result in redundent computation impact: unclear to me at first glance, probably just wasted computation & increased response size ...in addition to fixing the bug & adding explicit unit tests for it, the attached patch also includes some sanity check testing for the FacetComponent & ExtractingRequestHandler situations above, to try and protect us from similar "redundency" bugs in the future.
          Hide
          Hoss Man added a comment -

          planning to commit & backport to 5x today ... hoping to backport to for 4.10.3 tomorow but i want to see it run in jenkins at least 24 hours.

          Show
          Hoss Man added a comment - planning to commit & backport to 5x today ... hoping to backport to for 4.10.3 tomorow but i want to see it run in jenkins at least 24 hours.
          Hide
          ASF subversion and git services added a comment -

          Commit 1642727 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1642727 ]

          SOLR-6780: Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated.

          Show
          ASF subversion and git services added a comment - Commit 1642727 from hossman@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1642727 ] SOLR-6780 : Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated.
          Hide
          ASF subversion and git services added a comment -

          Commit 1642760 from hossman@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1642760 ]

          SOLR-6780: Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated. (merge r1642740)

          Show
          ASF subversion and git services added a comment - Commit 1642760 from hossman@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1642760 ] SOLR-6780 : Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated. (merge r1642740)
          Hide
          Hoss Man added a comment -

          This is committed to trunk & 5x.

          Backport to the 4.10.x branch was clean w/o any precommit problems – still running tests. I'm going to hold off on committing the 4.10.x backport for 24 hours to give jenkins some time to hate me.

          I've also updated the summary & description of hte bug to be more focused on the user impacts, to aid in people searching in the future.

          Show
          Hoss Man added a comment - This is committed to trunk & 5x. Backport to the 4.10.x branch was clean w/o any precommit problems – still running tests. I'm going to hold off on committing the 4.10.x backport for 24 hours to give jenkins some time to hate me. I've also updated the summary & description of hte bug to be more focused on the user impacts, to aid in people searching in the future.
          Hide
          ASF subversion and git services added a comment -

          Commit 1643006 from hossman@apache.org in branch 'dev/branches/lucene_solr_4_10'
          [ https://svn.apache.org/r1643006 ]

          SOLR-6780: Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated (merge r1642727)

          Show
          ASF subversion and git services added a comment - Commit 1643006 from hossman@apache.org in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1643006 ] SOLR-6780 : Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated (merge r1642727)
          Hide
          Marius Dumitru Florea added a comment -

          This issue's Fix Version/s includes 4.10.3 but the fix is not present on the correspoding tag. It looks like the original commit from the trunk was not correctly merged on the 4.10 branch as r1643006 doesn't contain the files that have been actually modified. The merge on the 5x branch seems fine on the other hand, as r1642760 contains the changed files (DefaultSolrParams most importantly).

          Can we have this issue properly fixed on the 4.10 branch? Especialy since this is a regression: I didn't have the facet fields listed twice on 4.8.1. Thanks.

          Show
          Marius Dumitru Florea added a comment - This issue's Fix Version/s includes 4.10.3 but the fix is not present on the correspoding tag . It looks like the original commit from the trunk was not correctly merged on the 4.10 branch as r1643006 doesn't contain the files that have been actually modified. The merge on the 5x branch seems fine on the other hand, as r1642760 contains the changed files (DefaultSolrParams most importantly). Can we have this issue properly fixed on the 4.10 branch? Especialy since this is a regression: I didn't have the facet fields listed twice on 4.8.1. Thanks.
          Hide
          Hoss Man added a comment -

          DAMN IT!

          I'm really sorry about this – i'm not sure what happened with this merge ... this sucks.

          i'll get this fixed on the 4.10 branch, and get the CHANGES.txt file updated so 5.0 accurately records when the fix was first released.

          Show
          Hoss Man added a comment - DAMN IT! I'm really sorry about this – i'm not sure what happened with this merge ... this sucks. i'll get this fixed on the 4.10 branch, and get the CHANGES.txt file updated so 5.0 accurately records when the fix was first released.
          Hide
          ASF subversion and git services added a comment -

          Commit 1657314 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1657314 ]

          SOLR-6780: issue did not get backported to 4x branch correctly, fixing CHANGES to accurately record that 5.0 will be first release with fix

          Show
          ASF subversion and git services added a comment - Commit 1657314 from hossman@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1657314 ] SOLR-6780 : issue did not get backported to 4x branch correctly, fixing CHANGES to accurately record that 5.0 will be first release with fix
          Hide
          ASF subversion and git services added a comment -

          Commit 1657316 from hossman@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1657316 ]

          SOLR-6780: issue did not get backported to 4x branch correctly, fixing CHANGES to accurately record that 5.0 will be first release with fix (merge r1657314)

          Show
          ASF subversion and git services added a comment - Commit 1657316 from hossman@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1657316 ] SOLR-6780 : issue did not get backported to 4x branch correctly, fixing CHANGES to accurately record that 5.0 will be first release with fix (merge r1657314)
          Hide
          ASF subversion and git services added a comment -

          Commit 1657317 from hossman@apache.org in branch 'dev/branches/lucene_solr_5_0'
          [ https://svn.apache.org/r1657317 ]

          SOLR-6780: issue did not get backported to 4x branch correctly, fixing CHANGES to accurately record that 5.0 will be first release with fix (merge r1657314)

          Show
          ASF subversion and git services added a comment - Commit 1657317 from hossman@apache.org in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1657317 ] SOLR-6780 : issue did not get backported to 4x branch correctly, fixing CHANGES to accurately record that 5.0 will be first release with fix (merge r1657314)
          Hide
          ASF subversion and git services added a comment -

          Commit 1657364 from hossman@apache.org in branch 'dev/branches/lucene_solr_4_10'
          [ https://svn.apache.org/r1657364 ]

          SOLR-6780: Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated. (merge r1642727, r1657314)

          Show
          ASF subversion and git services added a comment - Commit 1657364 from hossman@apache.org in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1657364 ] SOLR-6780 : Fixed a bug in how default/appends/invariants params were affecting the set of all keys found in the request parameters, resulting in some key=value param pairs being duplicated. (merge r1642727, r1657314)
          Hide
          Hoss Man added a comment -

          Ok, this should now be fixed properly on the 4_10 branch, and the CHANGES.txt should be correct moving forward.

          to clarify one comment...

          ...Especialy since this is a regression: I didn't have the facet fields listed twice on 4.8.1. Thanks.

          the root bug in how params were handled is far older then 4.8.1 ... the differences you're seeing may just be because of facet code changes after 4.8.1 that caused those code paths to be affected by this bug.

          Show
          Hoss Man added a comment - Ok, this should now be fixed properly on the 4_10 branch, and the CHANGES.txt should be correct moving forward. to clarify one comment... ...Especialy since this is a regression: I didn't have the facet fields listed twice on 4.8.1. Thanks. the root bug in how params were handled is far older then 4.8.1 ... the differences you're seeing may just be because of facet code changes after 4.8.1 that caused those code paths to be affected by this bug.
          Hide
          Marius Dumitru Florea added a comment -

          Hoss Man, thank you for handling this so quickly, and for the clarification. I'll wait for 4.10.4 now.

          Show
          Marius Dumitru Florea added a comment - Hoss Man , thank you for handling this so quickly, and for the clarification. I'll wait for 4.10.4 now.
          Hide
          Vincent Massol added a comment - - edited

          Thanks a lot Hoss Man! Any rough idea of when 4.10.4 will be released? I need this fix too (the last merge).

          Show
          Vincent Massol added a comment - - edited Thanks a lot Hoss Man ! Any rough idea of when 4.10.4 will be released? I need this fix too (the last merge).
          Hide
          Anshum Gupta added a comment -

          Bulk close after 5.0 release.

          Show
          Anshum Gupta added a comment - Bulk close after 5.0 release.

            People

            • Assignee:
              Hoss Man
              Reporter:
              Alexandre Rafalovitch
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development