Description
I was trying to address the performance regression on LUCENE-6815, but this is hard to do without introducing a specialization of DisjunctionScorer, which I'd like to avoid at all costs.
I think the performance regression would be easy to address without specialization if Scorers were changed to return an iterator instead of extending DocIdSetIterator. So conceptually the API would move from
class Scorer extends DocIdSetIterator { }
to
class Scorer {
  DocIdSetIterator iterator();
}
This would help me because, if none of the sub-clauses supports two-phase iteration, DisjunctionScorer could directly return the approximation as an iterator instead of having to check whether twoPhase == null on every iteration.
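To make the idea concrete, here is a minimal sketch of how the decision could move out of the per-document loop. All types below are simplified stand-ins written for this illustration (they only borrow names from Lucene and are not the real classes), and SketchScorer is a hypothetical name, not the actual DisjunctionScorer.

```java
// Simplified stand-ins for illustration only; these are NOT the real Lucene classes.
abstract class DocIdSetIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    abstract int docID();
    abstract int nextDoc();
}

// Stand-in two-phase view: an approximation plus a per-document confirmation.
abstract class TwoPhaseIterator {
    final DocIdSetIterator approximation;
    TwoPhaseIterator(DocIdSetIterator approximation) { this.approximation = approximation; }
    abstract boolean matches();
}

// Iterates a sorted array of doc IDs; plays the role of the approximation.
final class ArrayIterator extends DocIdSetIterator {
    private final int[] docs;
    private int idx = -1;
    ArrayIterator(int... docs) { this.docs = docs; }
    @Override int docID() {
        return idx < 0 ? -1 : idx >= docs.length ? NO_MORE_DOCS : docs[idx];
    }
    @Override int nextDoc() { idx++; return docID(); }
}

// Hypothetical scorer that returns an iterator instead of being one.
final class SketchScorer {
    private final DocIdSetIterator approximation;
    private final TwoPhaseIterator twoPhase; // null when no sub-clause is two-phase

    SketchScorer(DocIdSetIterator approximation, TwoPhaseIterator twoPhase) {
        this.approximation = approximation;
        this.twoPhase = twoPhase;
    }

    // The null check happens once here, not on every nextDoc() call.
    DocIdSetIterator iterator() {
        if (twoPhase == null) {
            return approximation; // no wrapper at all
        }
        return new DocIdSetIterator() {
            @Override int docID() { return approximation.docID(); }
            @Override int nextDoc() {
                // Advance the approximation until a candidate is confirmed.
                int doc = approximation.nextDoc();
                while (doc != NO_MORE_DOCS && !twoPhase.matches()) {
                    doc = approximation.nextDoc();
                }
                return doc;
            }
        };
    }
}

final class ScorerIteratorDemo {
    public static void main(String[] args) {
        DocIdSetIterator it = new SketchScorer(new ArrayIterator(1, 4, 7), null).iterator();
        for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            System.out.println(doc);
        }
    }
}
```

When twoPhase is null, the caller iterates the approximation itself, so the branch disappears from the hot loop without needing a specialized scorer class.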
Such an approach could also help remove some method calls. For instance, TermScorer.nextDoc calls PostingsEnum.nextDoc, but with this change TermScorer.iterator() could return the PostingsEnum and TermScorer would not even appear in stack traces when scoring. I hacked up a patch to see how much that would help, and luceneutil seems to like the change:
Task              QPS baseline   StdDev  QPS patch   StdDev  Pct diff
Fuzzy1                   88.54  (15.7%)      86.73  (16.6%)     -2.0%  (-29% - 35%)
AndHighLow              698.98   (4.1%)     691.11   (5.1%)     -1.1%  (-9% - 8%)
Fuzzy2                   26.47  (11.2%)      26.28  (10.3%)     -0.7%  (-19% - 23%)
MedSpanNear             141.03   (3.3%)     140.51   (3.2%)     -0.4%  (-6% - 6%)
HighPhrase               60.66   (2.6%)      60.48   (3.3%)     -0.3%  (-5% - 5%)
LowSpanNear              29.25   (2.4%)      29.21   (2.1%)     -0.1%  (-4% - 4%)
MedPhrase                28.32   (1.9%)      28.28   (2.0%)     -0.1%  (-3% - 3%)
LowPhrase                17.31   (2.1%)      17.29   (2.6%)     -0.1%  (-4% - 4%)
HighSloppyPhrase         10.93   (6.0%)      10.92   (6.0%)     -0.1%  (-11% - 12%)
MedSloppyPhrase          72.21   (2.2%)      72.27   (1.8%)      0.1%  (-3% - 4%)
Respell                  57.35   (3.2%)      57.41   (3.4%)      0.1%  (-6% - 6%)
HighSpanNear             26.71   (3.0%)      26.75   (2.5%)      0.1%  (-5% - 5%)
OrNotHighLow            803.46   (3.4%)     807.03   (4.2%)      0.4%  (-6% - 8%)
LowSloppyPhrase          88.02   (3.4%)      88.77   (2.5%)      0.8%  (-4% - 7%)
OrNotHighMed            200.45   (2.7%)     203.83   (2.5%)      1.7%  (-3% - 7%)
OrHighHigh               38.98   (7.9%)      40.30   (6.6%)      3.4%  (-10% - 19%)
HighTerm                 92.53   (5.3%)      95.94   (5.8%)      3.7%  (-7% - 15%)
OrHighMed                53.80   (7.7%)      55.79   (6.6%)      3.7%  (-9% - 19%)
AndHighMed              266.69   (1.7%)     277.15   (2.5%)      3.9%  (0% - 8%)
Prefix3                  44.68   (5.4%)      46.60   (7.0%)      4.3%  (-7% - 17%)
MedTerm                 261.52   (4.9%)     273.52   (5.4%)      4.6%  (-5% - 15%)
Wildcard                 42.39   (6.1%)      44.35   (7.8%)      4.6%  (-8% - 19%)
IntNRQ                   10.46   (7.0%)      10.99   (9.5%)      5.0%  (-10% - 23%)
OrNotHighHigh            67.15   (4.6%)      70.65   (4.5%)      5.2%  (-3% - 15%)
OrHighNotHigh            43.07   (5.1%)      45.36   (5.4%)      5.3%  (-4% - 16%)
OrHighLow                64.19   (6.4%)      67.72   (5.5%)      5.5%  (-6% - 18%)
AndHighHigh              64.17   (2.3%)      67.87   (2.1%)      5.8%  (1% - 10%)
LowTerm                 642.94  (10.9%)     681.48   (8.5%)      6.0%  (-12% - 28%)
OrHighNotMed             12.68   (6.9%)      13.51   (6.6%)      6.5%  (-6% - 21%)
OrHighNotLow             54.69   (6.8%)      58.25   (7.0%)      6.5%  (-6% - 21%)
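The TermScorer case mentioned before the benchmark can be sketched the same way. Again, the types below are simplified stand-ins that only borrow Lucene's names, and DelegatingTermScorer is a hypothetical name used here to represent the current "scorer is the iterator" shape; none of this is the actual patch.

```java
// Simplified stand-ins for illustration only; these are NOT the real Lucene classes.
abstract class DocIdSetIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    abstract int docID();
    abstract int nextDoc();
}

// Stand-in for a postings list: iterates a sorted array of doc IDs.
final class PostingsEnum extends DocIdSetIterator {
    private final int[] docs;
    private int idx = -1;
    PostingsEnum(int... docs) { this.docs = docs; }
    @Override int docID() {
        return idx < 0 ? -1 : idx >= docs.length ? NO_MORE_DOCS : docs[idx];
    }
    @Override int nextDoc() { idx++; return docID(); }
}

// Current shape: the scorer IS the iterator, so each call is forwarded.
final class DelegatingTermScorer extends DocIdSetIterator {
    private final PostingsEnum postings;
    DelegatingTermScorer(PostingsEnum postings) { this.postings = postings; }
    @Override int docID() { return postings.docID(); }
    @Override int nextDoc() { return postings.nextDoc(); } // extra call on every document
}

// Proposed shape: the scorer hands out the postings and steps out of the way.
final class TermScorer {
    private final PostingsEnum postings;
    TermScorer(PostingsEnum postings) { this.postings = postings; }
    DocIdSetIterator iterator() { return postings; }
}

final class TermScorerDemo {
    public static void main(String[] args) {
        DocIdSetIterator it = new TermScorer(new PostingsEnum(3, 8, 21)).iterator();
        for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
            System.out.println(doc);
        }
    }
}
```

In the second shape the collector calls nextDoc() on the postings object itself, which is why the scorer would no longer show up in stack traces during iteration.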
Attachments
Issue Links
- is related to: SOLR-14164 Replace Solr's FunctionRangeQuery with Lucene's (Patch Available)