LUCENE-8688 introduced a new storage strategy for leaves that contain duplicated points. In that case each point is stored together with its cardinality. We still call the IntersectVisitor once per document, so we check the same point against the query many times. The idea is to check the point once and then add all the documents.
The IntersectVisitor API does not allow that, so to exploit this property we need to either change the API or extend it. Here are the possibilities I can think of:
1) Modify the API by replacing the method IntersectVisitor#visit(int, byte[]) with a method that only tests whether a packed value matches the query.
This will allow the BKD reader to check whether a point matches the query and, if it does, call the method IntersectVisitor#visit(int) for all documents associated with that point.
The drawback of this approach is backwards compatibility: every class that implements this interface would need to be updated.
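To make option 1 concrete, here is a rough, standalone sketch. It does not use Lucene: MatchingVisitor, matches, PointRun and visitLeaf are hypothetical names chosen for illustration, and the real signature could well look different (for one, the real visitor methods throw IOException).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the visitor under option 1; not the Lucene interface. */
interface MatchingVisitor {
  /** Hypothetical replacement for visit(int, byte[]): tests a value, collects nothing. */
  boolean matches(byte[] packedValue);

  /** Collects a document whose point is already known to match. */
  void visit(int docID);
}

/** Hypothetical leaf entry: one packed value plus the documents that share it. */
final class PointRun {
  final byte[] packedValue;
  final int[] docIDs;

  PointRun(byte[] packedValue, int... docIDs) {
    this.packedValue = packedValue;
    this.docIDs = docIDs;
  }
}

final class MatchesSketch {
  /** Reader side: one comparison per distinct value, then one visit per document. */
  static void visitLeaf(List<PointRun> leaf, MatchingVisitor visitor) {
    for (PointRun run : leaf) {
      if (visitor.matches(run.packedValue)) {
        for (int docID : run.docIDs) {
          visitor.visit(docID);
        }
      }
    }
  }

  /** Toy exact-match query that counts how often it compares a value. */
  static final class ExactMatchVisitor implements MatchingVisitor {
    final byte[] target;
    final List<Integer> hits = new ArrayList<>();
    int comparisons = 0;

    ExactMatchVisitor(byte[] target) {
      this.target = target;
    }

    @Override
    public boolean matches(byte[] packedValue) {
      comparisons++;
      return Arrays.equals(target, packedValue);
    }

    @Override
    public void visit(int docID) {
      hits.add(docID);
    }
  }

  public static void main(String[] args) {
    List<PointRun> leaf = Arrays.asList(
        new PointRun(new byte[] {7}, 1, 2, 3),
        new PointRun(new byte[] {9}, 4));
    ExactMatchVisitor visitor = new ExactMatchVisitor(new byte[] {7});
    visitLeaf(leaf, visitor);
    // Three documents are collected after two comparisons, one per distinct value.
    System.out.println(visitor.hits + " after " + visitor.comparisons + " comparisons");
  }
}
```

With this shape the reader never needs the doc IDs in memory at once: it can call visit(int) while streaming them, which is why the number of comparisons depends on distinct values rather than on documents.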
2) Extend the API by adding a new default method to the IntersectVisitor interface that receives a point together with the doc IDs that share it.
The merit of this approach is that it is backwards compatible: it is up to implementors to override this method and get the benefit of this optimisation. The biggest downside, as Adrien Grand noted, is that it assumes the codec has the doc IDs available in an int slice, as opposed to, for instance, streaming them from disk directly to the IntersectVisitor.
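A rough, standalone sketch of option 2 follows. Again it does not use Lucene: BulkVisitor, the (int[], offset, length, byte[]) parameter list and both visitor classes are hypothetical, picked only to show how a default method keeps existing implementations working while letting new ones check the value once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the visitor under option 2; not the Lucene interface. */
interface BulkVisitor {
  /** Existing style of call: one document and its packed value. */
  void visit(int docID, byte[] packedValue);

  /** Hypothetical bulk method; the default falls back to one call per document. */
  default void visit(int[] docIDs, int offset, int length, byte[] packedValue) {
    for (int i = offset; i < offset + length; i++) {
      visit(docIDs[i], packedValue);
    }
  }
}

final class BulkVisitSketch {
  /** Existing implementation that is unaware of the new method. */
  static class LegacyVisitor implements BulkVisitor {
    final byte[] target;
    final List<Integer> hits = new ArrayList<>();
    int comparisons = 0;

    LegacyVisitor(byte[] target) {
      this.target = target;
    }

    @Override
    public void visit(int docID, byte[] packedValue) {
      comparisons++;
      if (Arrays.equals(target, packedValue)) {
        hits.add(docID);
      }
    }
  }

  /** Implementation that opts in: compares once, then adds the whole slice. */
  static final class DedupAwareVisitor extends LegacyVisitor {
    DedupAwareVisitor(byte[] target) {
      super(target);
    }

    @Override
    public void visit(int[] docIDs, int offset, int length, byte[] packedValue) {
      comparisons++;
      if (Arrays.equals(target, packedValue)) {
        for (int i = offset; i < offset + length; i++) {
          hits.add(docIDs[i]);
        }
      }
    }
  }

  public static void main(String[] args) {
    int[] docIDs = {10, 11, 12, 13};
    byte[] value = {5};

    LegacyVisitor legacy = new LegacyVisitor(new byte[] {5});
    legacy.visit(docIDs, 0, docIDs.length, value);

    DedupAwareVisitor aware = new DedupAwareVisitor(new byte[] {5});
    aware.visit(docIDs, 0, docIDs.length, value);

    // Same hits either way; only the number of comparisons differs.
    System.out.println("legacy: " + legacy.comparisons + ", aware: " + aware.comparisons);
  }
}
```

The sketch also shows the downside mentioned above: the caller has to materialise the doc IDs in an int[] before it can make the bulk call.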
Maybe there are more options I did not think about, so I am looking forward to hearing opinions on whether we should make this change at all and, if so, how to approach it. My +1 goes to 1).