> For highlighting, I've heard it argued both ways (i.e. prohibited terms can be important too).
Hmm, can you give an example where it's useful to highlight the prohibited terms?
It wasn't my argument, but I guess it was along the lines that the fact that the user did not want documents containing a specific term can itself carry information/relevance, so it can make sense to highlight those terms (maybe in a different color).
> I wasn't thinking about highlighting as much as something like distributed IDF or other global term statistics.
But normally prohibited clauses don't contribute to scoring, so the stats of terms inside them don't need to be distributed?
The key word there is "normally". As I said, it depends on the type of query in the prohibited clause, and the boolean query has no way of knowing whether it will matter or not. Something other than extractTerms could be used for distributed term stats, though.
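
To make the trade-off concrete, here is a rough sketch in Python. It is a toy model and not Lucene's API: the (occur, term) clause tuples, the shard dictionaries, the include_prohibited flag and the IDF formula are all assumptions made up for illustration. It only shows the choice being discussed, i.e. whether terms from prohibited clauses are included when gathering global term stats.

```python
import math
from collections import Counter

# Hypothetical toy model, not Lucene's API: a boolean query is a list of
# (occur, term) pairs, where occur is "MUST", "SHOULD" or "MUST_NOT".
def extract_terms(clauses, include_prohibited=False):
    """Collect the terms whose statistics would need to be distributed."""
    return {
        term
        for occur, term in clauses
        if include_prohibited or occur != "MUST_NOT"
    }

def global_doc_freqs(shards, terms):
    """Sum per-shard document frequencies for the requested terms only."""
    totals = Counter()
    for shard in shards:
        for term in terms:
            totals[term] += shard["df"].get(term, 0)
    return totals

def idf(doc_freq, num_docs):
    # One common smoothed IDF form; the exact formula is an assumption here.
    return math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))

# Made-up shard statistics, purely for illustration.
shards = [
    {"num_docs": 100, "df": {"lucene": 10, "solr": 40}},
    {"num_docs": 300, "df": {"lucene": 50, "solr": 5}},
]
query = [("MUST", "lucene"), ("MUST_NOT", "solr")]

# The "normal" case: prohibited terms are skipped, so their stats are never fetched.
scoring_terms = extract_terms(query)
# The conservative case: the caller cannot tell whether the prohibited clause matters.
all_terms = extract_terms(query, include_prohibited=True)

num_docs = sum(shard["num_docs"] for shard in shards)
dfs = global_doc_freqs(shards, scoring_terms)
global_idf = {term: idf(df, num_docs) for term, df in dfs.items()}
```

With include_prohibited left off, the stats for "solr" are never collected, which is fine only as long as nothing inside the prohibited clause ends up needing them.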