Solr
  1. Solr
  2. SOLR-4877

SolrIndexSearcher#getDocSetNC should check for null return in AtomicReader#fields()

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.2, 4.3
    • Fix Version/s: 4.3.1, 4.4, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      In LUCENE-5023 it was reported that composite reader contexts should not contain null fields() readers. But this is wrong, as a null-fields() reader may contain documents, just no fields.

      fields() and terms() is documented to return null, so DocSets should check for null (like all queries do in Lucene). It seems that DocSetNC does not correctly check for null.

      1. SOLR-4877.patch
        0.7 kB
        Uwe Schindler
      2. SOLR-4877-nospecialcase.patch
        2 kB
        Uwe Schindler

        Issue Links

          Activity

          Hide
          Uwe Schindler added a comment -

          Simple patch for getDocSetNC().

          I am not sure why there is this special handling of TermQuery, in my opinion, the if colause should go away and Solr should call the default TermScorer#collect(). There is nbo reason to have a special handling here (and no performance impact as TermScorer#collect is implemented the same way).

          We should remove the special case here, any comments?!

          Show
          Uwe Schindler added a comment - Simple patch for getDocSetNC(). I am not sure why there is this special handling of TermQuery, in my opinion, the if colause should go away and Solr should call the default TermScorer#collect(). There is nbo reason to have a special handling here (and no performance impact as TermScorer#collect is implemented the same way). We should remove the special case here, any comments?!
          Hide
          Uwe Schindler added a comment -

          Does anybody has an idea, how to test this bug?

          Show
          Uwe Schindler added a comment - Does anybody has an idea, how to test this bug?
          Hide
          Uwe Schindler added a comment - - edited

          Patch removing the special case for TermQuery.

          The reson why this was here may be from older times: In Lucene 2.x, the collect method calculated the score of the Query, too. But since 3.x this is no longer the case, so there is no speed improvement by short-circuiting TermQuery. The original TermScorer/Collector code is identical.

          Show
          Uwe Schindler added a comment - - edited Patch removing the special case for TermQuery. The reson why this was here may be from older times: In Lucene 2.x, the collect method calculated the score of the Query, too. But since 3.x this is no longer the case, so there is no speed improvement by short-circuiting TermQuery. The original TermScorer/Collector code is identical.
          Hide
          Uwe Schindler added a comment -

          I will commit the "nospecialcase" patch if nobody objects.

          Show
          Uwe Schindler added a comment - I will commit the "nospecialcase" patch if nobody objects.
          Hide
          Robert Muir added a comment -

          +1

          Show
          Robert Muir added a comment - +1
          Hide
          Shalin Shekhar Mangar added a comment -

          Bulk close after 4.3.1 release

          Show
          Shalin Shekhar Mangar added a comment - Bulk close after 4.3.1 release
          Hide
          Feihong Huang added a comment -

          hi, Uwe,
          I think the special handling of TermQuery just because of performance impact.
          If we use the default search function in lucene such as "super.search(query,null,collector)",
          it will create TermWeight for TermQuery to calculate queryNorm(q) and idf(t)2 * t.getBoost() and so on.

          Therefore, i think the special handling of TermQuery is useful. Any comments?

          Show
          Feihong Huang added a comment - hi, Uwe, I think the special handling of TermQuery just because of performance impact. If we use the default search function in lucene such as "super.search(query,null,collector)", it will create TermWeight for TermQuery to calculate queryNorm(q) and idf(t)2 * t.getBoost() and so on. Therefore, i think the special handling of TermQuery is useful. Any comments?
          Hide
          Uwe Schindler added a comment -

          Hi the calculation is not a problem at all, it is done one time in TermWeight. The statistics for the term have to be fetched in any case (as it is part of the term dictionary). If this would be a problem here, a lot of queries in Lucene like MultiTermQuery (NumericRangeQuery, TermRangeQuery, FuzzyQuery) would need to be optimized, too, because they don't calculate scores. The reason why this special case was here is caused by older Lucene versions where the actual score per document was calculated by TermScorer although not requested. This was the performance impact, not the term statistics.

          If you don't have a benchmark showing that fetching the term statistics for this case is affecting performance I would prefer the simplier code.

          Show
          Uwe Schindler added a comment - Hi the calculation is not a problem at all, it is done one time in TermWeight. The statistics for the term have to be fetched in any case (as it is part of the term dictionary). If this would be a problem here, a lot of queries in Lucene like MultiTermQuery (NumericRangeQuery, TermRangeQuery, FuzzyQuery) would need to be optimized, too, because they don't calculate scores. The reason why this special case was here is caused by older Lucene versions where the actual score per document was calculated by TermScorer although not requested. This was the performance impact, not the term statistics. If you don't have a benchmark showing that fetching the term statistics for this case is affecting performance I would prefer the simplier code.
          Hide
          Feihong Huang added a comment -

          Thanks for comments and it make sense.

          Show
          Feihong Huang added a comment - Thanks for comments and it make sense.

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Uwe Schindler
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development