  Solr / SOLR-13838

igain query parser generating invalid output

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 8.2
    • Fix Version/s: None
    • Component/s: query parsers
    • Labels: None
    • Environment: The issue is a generic Java defect and is therefore independent of the operating system or software platform.

    Description

      While investigating the output of the features() stream source, I found that terms are being returned with NaN in the score_f field:

          "docs": [
            {
              "featureSet_s": "business",
              "score_f": "NaN",
              "term_s": "1,011.15",
              "idf_d": "-Infinity",
              "index_i": 1,
              "id": "business_1"
            },
            {
              "featureSet_s": "business",
              "score_f": "NaN",
              "term_s": "10.3m",
              "idf_d": "-Infinity",
              "index_i": 2,
              "id": "business_2"
            },
            {
              "featureSet_s": "business",
              "score_f": "NaN",
              "term_s": "01",
              "idf_d": "-Infinity",
              "index_i": 3,
              "id": "business_3"
            },...

      Looking into org/apache/solr/search/IGainTermsQParserPlugin.java, it seems that when a term does not occur in either the positive or the negative documents, the docFreq calculation (docFreq = xc + nc) yields 0, so the subsequent calculations divide by 0 and produce NaN.
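
      To illustrate why a zero docFreq surfaces as NaN instead of an exception, here is a minimal Java sketch. It is not the formula used by IGainTermsQParserPlugin; the class ZeroDocFreqDemo and the helper positiveRatio are hypothetical and exist only to show the floating-point behaviour.

```java
// Illustrative only: NOT the actual computation in IGainTermsQParserPlugin.
class ZeroDocFreqDemo {

    // Hypothetical helper: share of a term's documents that are positive.
    static double positiveRatio(int xc, int nc) {
        double docFreq = xc + nc;   // 0.0 when the term occurs in neither set
        return xc / docFreq;        // 0 / 0.0 evaluates to NaN; no exception is thrown
    }

    public static void main(String[] args) {
        System.out.println(positiveRatio(3, 1));   // prints 0.75
        System.out.println(positiveRatio(0, 0));   // prints NaN
        System.out.println(Math.log(0.0));         // prints -Infinity
    }
}
```

      Because floating-point division by zero does not throw in Java, the invalid value propagates silently into the output unless it is filtered explicitly.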

      Attached is a patch that skips terms for which docFreq is 0 in the finish() method of IGainTermsQParserPlugin; this resolves the NaN scores in the features() output.
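
      The kind of guard described above might look roughly like the following sketch. This is not the attached patch: the class SkipZeroDocFreq, the method score, the map layout (term mapped to {xc, nc}), and the placeholder score are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: names and the placeholder score are hypothetical,
// not copied from the attached patch or from Solr.
class SkipZeroDocFreq {

    // counts maps each term to {xc, nc}: positive and negative document counts.
    static Map<String, Double> score(Map<String, int[]> counts) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            int xc = e.getValue()[0];
            int nc = e.getValue()[1];
            int docFreq = xc + nc;
            if (docFreq == 0) {
                continue; // skip the term; scoring it would yield NaN
            }
            scores.put(e.getKey(), (double) xc / docFreq); // placeholder score
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, int[]> counts = new LinkedHashMap<>();
        counts.put("profit", new int[] {3, 1});
        counts.put("10.3m", new int[] {0, 0});
        System.out.println(score(counts)); // prints {profit=0.75}
    }
}
```

      Skipping the term, as opposed to assigning it a default score, keeps terms with no supporting documents out of the result set entirely.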

          People

            Assignee: Unassigned
            Reporter: Peter Davie (convsolns)