(In reply to comment #10)
> Hi Paul,
> I finally found time to look into your code in detail and I think
> it's really excellent work. Before committing it, I have a few questions.
> *) In your source files you have included a copyright statement referring
> to yourself. Of course you include the Apache License. However, I haven't seen
> other source files in Lucene with similar copyright statements. I don't know the
> legal consequences of that. Maybe someone else on the list knows more. The
> simplest solution would be to substitute "Copyright 2004 Paul Elschot" with
> "Copyright 2004 The Apache Software Foundation". Would you agree?
The intention is to allow the Apache Software Foundation to take over the
copyright in case they want to.
As I understand the Apache License, it allows the copyright to be
taken over. So I used my own copyright, which could be changed
when the code is taken into an Apache project.
However, the relevant documentation
says that contributed files should have the copyright
assigned to the Apache Software Foundation.
I'll do that next time.
Could you change the copyright notices accordingly this time?
> *) BooleanScorer2 extends NrMatchersScorer and nrMatchers() always returns 1.
> Is there a reason for that? I think it should either only extend Scorer or
> deliver the correct values. I opt for extending Scorer only.
The reason is that a BooleanQuery can be scored by a few cooperating
(nested) scorers, and that it should still be possible to compute the
coordination factor from the number of matching scorers of the originally
added clauses.
By default nrMatchers() returns 1, and this is for the case when the scorer is
given to the BooleanScorer2 as a scorer of an added clause.
(At the moment these are wrapped in a NrMatchersScorer.)
The cooperating scorers implementing the boolean behaviour
add these numbers for their subscorers to make it work in the same way
as scoring a single BooleanQuery.
The idea is to either sum nrMatchers(), or to use nrMatchers()
for the coordination factor in the score and return 1 for nrMatchers().
It might be worthwhile to add something like this in the javadocs.
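To illustrate the idea, here is a minimal sketch. The classes below are simplified, hypothetical stand-ins (CountingScorer, ClauseScorer, SummingScorer, TopLevelScorer are invented names, not the classes in the patch and not the Lucene Scorer API), and the coordination factor is assumed to have the DefaultSimilarity shape overlap / maxOverlap.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for NrMatchersScorer; document iteration is left out.
interface CountingScorer {
    float score();      // score for the current document
    int nrMatchers();   // number of originally added clauses that matched
}

// Scorer of a single added clause: counts as one matcher by default.
final class ClauseScorer implements CountingScorer {
    private final float score;
    ClauseScorer(float score) { this.score = score; }
    public float score() { return score; }
    public int nrMatchers() { return 1; }
}

// Cooperating (nested) scorer: sums raw scores and sums nrMatchers().
final class SummingScorer implements CountingScorer {
    private final List<CountingScorer> subScorers;
    SummingScorer(CountingScorer... subScorers) {
        this.subScorers = Arrays.asList(subScorers);
    }
    public float score() {
        float sum = 0f;
        for (CountingScorer s : subScorers) sum += s.score();
        return sum;
    }
    public int nrMatchers() {
        int n = 0;
        for (CountingScorer s : subScorers) n += s.nrMatchers();
        return n;
    }
}

// Top level scorer: uses the summed count for the coordination factor
// and reports 1 to the outside, like any other clause scorer.
final class TopLevelScorer implements CountingScorer {
    private final CountingScorer inner;
    private final int maxCoord; // total number of clauses, assumed known
    TopLevelScorer(CountingScorer inner, int maxCoord) {
        this.inner = inner;
        this.maxCoord = maxCoord;
    }
    public float score() {
        float coord = (float) inner.nrMatchers() / maxCoord; // assumed coord shape
        return inner.score() * coord;
    }
    public int nrMatchers() { return 1; }
}
```

In this sketch a nested scorer never applies a coordination factor itself; only the top level does, which is why it can safely report 1 when it is used as a clause of another query.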
> *) All NrMatchersScorers except for BooleanScorer2 and ConjunctionScorer
> use a similarity implementation. They compute raw scores and nrMatches.
> ConjunctionScorer is a hybrid. It uses coord-factors and is used as
> NrMatchersScorer. This could lead to incorrect results with Similarity
> implementations other than DefaultSimilarity. A ConjunctionScorer used as
> NrMatchersScorer should compute raw scores, if used as standard Scorer it
> should use coord-factors. How can we achieve this in an elegant way?
You're right that ConjunctionScorer has a double role here:
it can be used as a full replacement for BooleanScorer when all clauses
are required, and it can also be used to score only the required
clauses combined with ReqOptScorer or ReqExclScorer for the other
clauses.
The implementation could only fail when ConjunctionScorer
provides a nrMatchers bigger than 1, and computes the coordination
factor into its score. The implementation prevents
this by using a top level scorer that always returns 1 for nrMatchers,
and uses nrMatchers() of its subscorers for the coordination factor.
This is somewhat tricky, so I hope I got all the details right.
It also means that the changed ConjunctionScorer should not multiply
a coordination factor into its score() value. I don't remember
whether or not it does that, but it shouldn't.
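The failure case can be sketched as follows. This is a hypothetical illustration, not code from the patch: CoordFunction, DoubleCoordSketch and the CUSTOM factor are invented, and DEFAULT_STYLE only assumes the overlap / maxOverlap shape of DefaultSimilarity.

```java
// Hypothetical stand-in for the coord() part of a Similarity.
interface CoordFunction {
    float coord(int overlap, int maxOverlap);
}

final class DoubleCoordSketch {
    // Equals 1 when every clause matches, as with DefaultSimilarity.
    static final CoordFunction DEFAULT_STYLE =
        (overlap, max) -> (float) overlap / max;
    // Made-up alternative that stays below 1 even when every clause matches.
    static final CoordFunction CUSTOM =
        (overlap, max) -> (float) overlap / (max + 1);

    static float sum(float[] scores) {
        float total = 0f;
        for (float s : scores) total += s;
        return total;
    }

    // Conjunction returns the raw sum; only the top level applies coord.
    static float rawThenTopLevel(float[] scores, CoordFunction f) {
        int n = scores.length; // in a conjunction every clause matched
        return sum(scores) * f.coord(n, n);
    }

    // Conjunction multiplies coord into its score AND reports n matchers,
    // so the top level applies the factor a second time.
    static float hybridThenTopLevel(float[] scores, CoordFunction f) {
        int n = scores.length;
        float conjunctionScore = sum(scores) * f.coord(n, n);
        return conjunctionScore * f.coord(n, n);
    }
}
```

With the default-style factor both paths agree, because coord(n, n) is 1. With a factor like CUSTOM the hybrid path applies it twice, which is the incorrect result to avoid.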
One way to solve this would be to use another name for the changed
ConjunctionScorer, or to explicitly document that it should be
wrapped in a scorer that returns 1 for nrMatchers() when implementing
a full BooleanQuery.
In case nrMatchers() is added to Scorer, this wrapping would not
be necessary, and it should be documented that it is expected that
the scorers for the clauses implement their own coordination factor
into their score and return 1 for nrMatchers().
There may be a better way to implement this 'decoupling'
of the coordination factor from the cooperating scorers entirely within
BooleanScorer2, for example by maintaining the
number of matching subscorers in the top level scorer, invisible
from the outside, and having all the cooperating scorers maintain
this attribute of the top level scorer instead of their own.
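A rough sketch of that alternative, under stated assumptions: TopLevelCounter and ReportingClause are invented names, the coordination factor is assumed to be matching / total, and each clause's score() is assumed to be called at most once per document.

```java
// Hypothetical top level scorer state: the count of matching clauses
// lives here and is not exposed through a public nrMatchers().
final class TopLevelCounter {
    private int matchingClauses;

    // Reset before scoring each document.
    void startDocument() { matchingClauses = 0; }

    // Called by cooperating scorers when one of their clauses matches.
    void clauseMatched() { matchingClauses++; }

    // Applies the coordination factor once, at the top level only.
    float score(float rawSum, int totalClauses) {
        return rawSum * ((float) matchingClauses / totalClauses);
    }
}

// Hypothetical cooperating scorer: it updates the top level counter
// instead of maintaining and summing its own nrMatchers().
final class ReportingClause {
    private final TopLevelCounter top;
    private final float rawScore;

    ReportingClause(TopLevelCounter top, float rawScore) {
        this.top = top;
        this.rawScore = rawScore;
    }

    // Assumed to be called once per matching document.
    float score() {
        top.clauseMatched();
        return rawScore;
    }
}
```

The cooperating scorers then only return raw scores, and nothing about the coordination factor leaks outside the top level scorer.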