SOLR-7335

Multivalued field that is boosted at index time has the wrong norm.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 4.10, 5.0, 5.1
    • Fix Version/s: 5.2, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      A multivalued field gets the wrong norm when the field value is tokenized, the field or document is boosted, and the field is not the source of a copyField.

      $ java -jar start.jar &
      
      $ echo '{
      "add": {
        "doc": {
          "id":"no-boosted",
          "features": ["a","b","c"],
          "dyn_not_copied_txt": ["a","b","c"]
        }
      },
      "add": {
        "boost": 10,
        "doc": {
          "id":"boosted",
          "features": ["a","b","c"],
          "dyn_not_copied_txt": ["a","b","c"]
        }
      }}' > test.json
      
      $ curl 'http://localhost:8983/solr/update/json?commit=true' -H 'Content-type:application/json' --data-binary @test.json
      {"responseHeader":{"status":0,"QTime":41}}
      
      $ curl 'http://localhost:8983/solr/select' -d 'omitHeader=true&wt=json&indent=on&q=*:*&fl=id,norm(features),norm(dyn_not_copied_txt)'
      {
        "response":{"numFound":2,"start":0,"docs":[
            {
              "id":"no-boosted",
              "norm(features)":0.5,
              "norm(dyn_not_copied_txt)":0.5},
            {
              "id":"boosted",
              "norm(features)":5.0,
              "norm(dyn_not_copied_txt)":512.0}]
        }}
      

      In the example above, "features" is the source of a copyField, whereas "dyn_not_copied_txt" is not.

      "features" and "dyn_not_copied_txt" have the same type attribute (type="text_general"), the same values (["a","b","c"]), and the same boost, so both fields should have the same norm in each document.

      However, in the boosted document only, the field that is not a copyField source has a far larger norm (512.0 instead of 5.0).
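
The three reported values are consistent with the document boost having been applied once per field value. The sketch below is only an illustration under assumptions that the report does not state: the default TF-IDF similarity is in use, its length norm is boost / sqrt(numTerms), each of the three values contributes one term, and the norm is stored through a lossy one-byte encoding modeled on Lucene's SmallFloat.floatToByte315. The class and method names are hypothetical.

```java
final class NormSketch {

    // Lossy float-to-byte encoding, modeled on Lucene's SmallFloat.floatToByte315.
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            // Underflow: zero or negative maps to 0, tiny positives to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // Overflow: largest representable value.
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Inverse of the encoding above.
    static float byte315ToFloat(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Assumed length norm: boost / sqrt(numTerms), then an encode/decode round trip.
    static float storedNorm(float boost, int numTerms) {
        float raw = boost * (float) (1.0 / Math.sqrt(numTerms));
        return byte315ToFloat(floatToByte315(raw));
    }

    public static void main(String[] args) {
        System.out.println(storedNorm(1f, 3));    // no boost
        System.out.println(storedNorm(10f, 3));   // boost applied once
        System.out.println(storedNorm(1000f, 3)); // boost applied once per value: 10 * 10 * 10
    }
}
```

Under these assumptions, the three calls print 0.5, 5.0 and 512.0, which matches the norms reported for the unboosted document, for "features", and for "dyn_not_copied_txt" in the boosted document.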

      1. SOLR-7335.patch (4 kB, Shingo Sasaki)

          Activity

          Shingo Sasaki added a comment - edited

          "org.apache.solr.update.DocumentBuilder" computes the wrong boost value.

          Since the SOLR-6259 patch, the variables "fieldBoost" and "compoundBoost" are not initialized to 1.0f when a multivalued field has no copyFields. The attached patch fixes this bug.
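
The effect described here can be modeled with a few lines of Java. This is a simplified sketch, not the actual DocumentBuilder code: it assumes that the boosts attached to each value of a same-named field are multiplied together at index time, and the class name, method name, and the resetAfterFirstValue flag are hypothetical.

```java
final class BoostSketch {

    // Returns the combined boost of a multivalued field, assuming the
    // per-value boosts are multiplied together at index time.
    static float effectiveBoost(float docBoost, float fieldBoost,
                                int numValues, boolean resetAfterFirstValue) {
        float compoundBoost = fieldBoost * docBoost;
        float product = 1.0f;
        for (int i = 0; i < numValues; i++) {
            product *= compoundBoost; // boost attached to this value
            if (resetAfterFirstValue) {
                // Only the first value carries the boost; later ones are neutral.
                compoundBoost = 1.0f;
            }
        }
        return product;
    }

    public static void main(String[] args) {
        // Document boost 10, field boost 1, three values, as in the report.
        System.out.println(effectiveBoost(10f, 1f, 3, false)); // boost never set back to 1.0f
        System.out.println(effectiveBoost(10f, 1f, 3, true));  // boost set back to 1.0f after the first value
    }
}
```

In this model the first call prints 1000.0 and the second prints 10.0, so a boost of 10 that is never set back to 1.0f is counted three times for a three-value field.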

          Hoss Man added a comment -

          Bug looks terrible, patch looks good.

          Running tests now, and then I'm going to start committing and backporting to 5x and 5.2.

          ASF subversion and git services added a comment -

          Commit 1681249 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1681249 ]

          SOLR-7335: Fix doc boosts to no longer be multiplied in each field value in multivalued fields that are not used in copyFields

          ASF subversion and git services added a comment -

          Commit 1681253 from hossman@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1681253 ]

          SOLR-7335: Fix doc boosts to no longer be multiplied in each field value in multivalued fields that are not used in copyFields (merge r1681249)

          Shingo Sasaki added a comment -

          Thanks for the commit!

          ASF subversion and git services added a comment -

          Commit 1681260 from hossman@apache.org in branch 'dev/branches/lucene_solr_5_2'
          [ https://svn.apache.org/r1681260 ]

          SOLR-7335: Fix doc boosts to no longer be multiplied in each field value in multivalued fields that are not used in copyFields (merge r1681249)

          Hoss Man added a comment -

          Thanks for the patch!

          Anshum Gupta added a comment -

          Bulk close for 5.2.0.


            People

            • Assignee: Hoss Man
            • Reporter: Shingo Sasaki
            • Votes: 0
            • Watchers: 4
