The underlying change is that Lucene now creates scorers per-segment, and if you use Lucene's searcher.search(...,sort), the FieldCache is also populated per-segment.
The biggest issue: if the FieldCache gets populated at both the top-level reader and per-segment, memory usage doubles (as does un-inversion time).
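To make the doubling concrete, here is a minimal sketch, assuming the cache is keyed by reader instance. ReaderKeyedCache and its methods are hypothetical names for illustration only, not the Lucene FieldCache API, and plain Objects stand in for readers.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a reader-keyed cache; NOT the Lucene FieldCache API.
public class ReaderKeyedCache {
    // One entry per reader instance, compared by identity.
    private final Map<Object, int[]> cache = new IdentityHashMap<>();

    // Returns the cached array for this reader, "un-inverting" (here: allocating) on first use.
    public int[] get(Object readerKey, int maxDoc) {
        return cache.computeIfAbsent(readerKey, k -> new int[maxDoc]);
    }

    // Total number of cached values held across all readers.
    public long totalValues() {
        long total = 0;
        for (int[] values : cache.values()) {
            total += values.length;
        }
        return total;
    }
}
```

With a 100-document index split into segments of 60 and 40 documents, populating the top-level entry and both segment entries leaves 200 values cached for 100 documents.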
- Faceting on single-valued fields uses the FieldCache at the top-level (and would be
  - This is non-trivial to change: if we counted per-segment, the counts would somehow have to be merged across segments.
- Sorting in Solr currently uses the FieldCache at the top level
  - This can't easily be changed to use Lucene's searcher.search(...,sort), since we are using a hit collector (which can be wrapped in a time-limited collector).
- Distributed search uses the top-level FieldCache to retrieve sort field values.
- FunctionQuery now derives values at the segment level
  - This also applies to the function range query
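The merge step mentioned for per-segment faceting could look roughly like the sketch below. It assumes each segment reports its counts keyed by term value rather than by segment-local ordinal. SegmentFacetMerge and mergeCounts are hypothetical names, not existing Solr code.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Solr.
public class SegmentFacetMerge {
    // Sums per-segment term counts into a single index-wide map, sorted by term.
    public static Map<String, Integer> mergeCounts(List<Map<String, Integer>> perSegment) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> segment : perSegment) {
            for (Map.Entry<String, Integer> e : segment.entrySet()) {
                // Add this segment's count to the running total for the term.
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }
}
```

Keying by term value avoids comparing ordinals across segments, but every distinct term has to be touched during the merge, which is part of why the change is non-trivial.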
Another issue for function queries is the use of ord(): ordinals are local to each segment, so its values won't be valid in multi-segment indexes if evaluated at the segment level.
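A small sketch of why segment-level ordinals are not comparable. It uses a simplified 0-based ordinal (position in the sorted distinct terms); SegmentOrdDemo is a hypothetical class, and this is not how Lucene computes ord() internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration of segment-local ordinals; not the Lucene implementation.
public class SegmentOrdDemo {
    // Position of the term among the sorted distinct terms (0-based), or -1 if absent.
    public static int ord(List<String> terms, String term) {
        List<String> sorted = new ArrayList<String>(new TreeSet<String>(terms));
        return sorted.indexOf(term);
    }
}
```

Take segment A with terms [apple, cherry] and segment B with terms [banana, cherry, date]. Here "cherry" has ord 1 in both segments but ord 2 over the combined term list, while "apple" in A and "banana" in B both get ord 0 despite being different values.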
Evaluate custom sorters (like query elevation, etc.) to ensure that they still work at the segment level. Solr doesn't currently do segment-level sorting as Lucene now does, but perhaps we should switch for better near-real-time support.