Hack patch that computes the heuristic up front in weight init, so it scores all segments consistently and returns the proper scoresDocsOutOfOrder for BS1.
Uwe's new test (the nestedFilterQuery) doesn't pass yet, don't know why.
Very easy to explain: because it's a hack! The problem is simple: the new test explicitly checks that acceptDocs are correctly handled by the query, which is not the case with your modifications. In createWeight you get the first segment and create the filter's DocIdSet on it, passing liveDocs (because you have nothing else). You cache this first DocIdSet (to avoid executing getDocIdSet twice for the first segment) and thereby miss the real acceptDocs (which are != liveDocs in this test). The first segment therefore returns more documents than it should.
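To make the acceptDocs point concrete, here is a minimal sketch that models doc sets as java.util.BitSet. It does not use the Lucene API: `AcceptDocsDemo` and `docIdSet` are hypothetical names, and the only assumption is that a filter's result is the intersection of its matches with the docs the caller accepts.

```java
import java.util.BitSet;

// Hypothetical stand-in, not a Lucene class.
final class AcceptDocsDemo {
    // Docs matched by the filter, restricted to the docs the caller accepts.
    static BitSet docIdSet(BitSet filterMatches, BitSet acceptDocs) {
        BitSet result = (BitSet) filterMatches.clone();
        result.and(acceptDocs);
        return result;
    }

    // Builds a BitSet from a list of doc ids.
    static BitSet bits(int... docs) {
        BitSet set = new BitSet();
        for (int doc : docs) {
            set.set(doc);
        }
        return set;
    }
}
```

If the set is computed once with liveDocs and cached, a later call that should have used a narrower acceptDocs gets the wider cached set, which is the extra documents seen on the first segment.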
Altogether, the hack is of course uncommittable, and the source of our problem lies only in the out-of-order setting. The fix in your patch is fine, but too much. The scoresDocsOutOfOrder method should simply return what the inner weight returns, because it may return docs out of order. It can still return them in order (if a filter needs to be applied using an iterator). This is no different from the behaviour before. So the fix is easy: do the same as in ConstantScoreQuery, where we return the setting from the inner weight.
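The delegation could look roughly like the sketch below. `SimpleWeight` is a simplified stand-in for Lucene's Weight that carries only the one method discussed here, and `DelegatingWeight` is a hypothetical name for the wrapping weight; neither is the actual patch.

```java
// Simplified stand-in for Lucene's Weight; only the method discussed here.
interface SimpleWeight {
    boolean scoresDocsOutOfOrder();
}

// Hypothetical wrapper playing the role of the filtered query's weight.
final class DelegatingWeight implements SimpleWeight {
    private final SimpleWeight inner;

    DelegatingWeight(SimpleWeight inner) {
        this.inner = inner;
    }

    // No heuristic of its own: report whatever the wrapped weight reports.
    @Override
    public boolean scoresDocsOutOfOrder() {
        return inner.scoresDocsOutOfOrder();
    }
}
```

Returning true here only says that docs may arrive out of order; the wrapper is still free to deliver them in order when the filter is applied through an iterator.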
Being consistent in selecting scorer implementations between segments is not an issue specific to this case; it's a general problem and cannot be solved by a hack. The selection of Scorer for BooleanQuery can differ even without FilteredQuery, as BooleanWeight might return a different scorer, too (so the problem is BooleanScorer, which does the selection of its Scorer per segment). To fix this, BooleanWeight must make all the scorer decisions in its ctor, so we would also need to pass scoreInOrder and other parameters to the Weight's ctor.
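As a rough illustration of "decide in the ctor", the sketch below fixes the scorer choice once at construction and returns the same choice for every segment. All names (`ScorerKind`, `FixedChoiceWeight`, `clausesAllowOutOfOrder`) are hypothetical, and the single boolean stands in for whatever query-level checks would really decide eligibility; this is not how BooleanWeight is implemented.

```java
// Hypothetical names throughout; this is not Lucene's BooleanWeight.
enum ScorerKind { IN_ORDER, OUT_OF_ORDER }

final class FixedChoiceWeight {
    private final ScorerKind kind;

    // scoreInOrder: what the collector requires, assumed known at construction.
    // clausesAllowOutOfOrder: stand-in for the query-level eligibility checks.
    FixedChoiceWeight(boolean scoreInOrder, boolean clausesAllowOutOfOrder) {
        this.kind = (!scoreInOrder && clausesAllowOutOfOrder)
                ? ScorerKind.OUT_OF_ORDER
                : ScorerKind.IN_ORDER;
    }

    boolean scoresDocsOutOfOrder() {
        return kind == ScorerKind.OUT_OF_ORDER;
    }

    // Called once per segment; the answer never depends on the segment.
    ScorerKind scorerKindFor(int segmentOrd) {
        return kind;
    }
}
```

Because the decision is stored in a final field, scoresDocsOutOfOrder and the per-segment choice cannot disagree.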
Please remove the hack and only implement scoresDocsOutOfOrder correctly (it is the reason for the problem, as it suddenly returns documents in a different order). With that patch we can still get the documents in a different order if random access is enabled together with the filter, whereas the old IndexSearcher used DocIdSetIterator (in order). We should ignore those differences in document order if the scores are identical (and Mike's output shows the scores are equal). If we want to check that the results are identical, the benchmark test must explicitly request docs in order on trunk vs. patch to be consistent. But then it's no longer a benchmark.
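One way to compare two result lists while ignoring the order of equal-scoring documents is to bring both into a canonical order first (score descending, then doc id ascending). The sketch below is an illustration only: `Hit` and `HitComparison` are hypothetical helpers, not part of the benchmark or of Lucene, and it assumes both lists contain the complete set of hits rather than a top-N cut that could split a group of tied documents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class: a doc id with its score.
final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

final class HitComparison {
    // Copy sorted by score descending, ties broken by doc id ascending.
    static List<Hit> canonical(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((x, y) -> {
            int byScore = Float.compare(y.score, x.score);
            return byScore != 0 ? byScore : Integer.compare(x.doc, y.doc);
        });
        return sorted;
    }

    // True if both lists hold the same (doc, score) pairs, in any order.
    static boolean sameHitsIgnoringTieOrder(List<Hit> a, List<Hit> b) {
        if (a.size() != b.size()) {
            return false;
        }
        List<Hit> ca = canonical(a);
        List<Hit> cb = canonical(b);
        for (int i = 0; i < ca.size(); i++) {
            Hit x = ca.get(i);
            Hit y = cb.get(i);
            if (x.doc != y.doc || Float.compare(x.score, y.score) != 0) {
                return false;
            }
        }
        return true;
    }
}
```

A check like this keeps the benchmark free to score out of order while still catching a run that returns different documents or different scores.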
Conclusion: in general we have explained the differences between the patches, and I think my original patch is fine except for Weight.scoresDocsOutOfOrder, which should return the inner Weight's setting (like CSQ does) - no magic needed. Our patch does not return wrong documents; only the order of equal-scoring documents is different, which is perfectly fine.