OK, I updated the patch for today's trunk, and it now actually works with ExactStatsCache. We now get the correct DF for distributed queries.
I removed perReaderTermContext from ExactStatsCache. It cached the TermContext for new terms, which was a problem: caching it this way meant that any later term got the same DF as the first.
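To illustrate the kind of bug described above, here is a minimal sketch. It is not the actual ExactStatsCache code: the class, the toy index, and both methods are hypothetical, and it assumes the cache entry was effectively keyed without the term.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the caching bug; not the real ExactStatsCache code.
class TermStatsCacheSketch {
    // Toy "index": document frequency per term for a single reader.
    private static final Map<String, Integer> INDEX_DF = new HashMap<>();
    static {
        INDEX_DF.put("solr", 10);
        INDEX_DF.put("lucene", 3);
    }

    // Assumed buggy cache: keyed by reader only, so the term is ignored on a hit.
    private final Map<String, Integer> perReaderCache = new HashMap<>();

    static int lookup(String term) {
        return INDEX_DF.getOrDefault(term, 0);
    }

    // The first term seen for a reader fills the slot; later terms reuse its DF.
    int buggyDocFreq(String readerId, String term) {
        return perReaderCache.computeIfAbsent(readerId, r -> lookup(term));
    }

    // Mirrors the change in the patch: no cache, every term is looked up itself.
    int uncachedDocFreq(String term) {
        return lookup(term);
    }

    public static void main(String[] args) {
        TermStatsCacheSketch sketch = new TermStatsCacheSketch();
        System.out.println(sketch.buggyDocFreq("reader-1", "solr"));
        System.out.println(sketch.buggyDocFreq("reader-1", "lucene"));
        System.out.println(sketch.uncachedDocFreq("lucene"));
    }
}
```

In this sketch the second call to buggyDocFreq returns the DF of "solr" even though it asks for "lucene", which matches the symptom I saw.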
I also added a local boolean to SolrIndexSearcher's collectionStatistics() and termStatistics() to force them to return only local statistics. This is a nasty hack to prevent them from returning the other shard's DF. Without it, DF increases for every other request, and in the end it crashes the system because the number gets too high.
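A rough sketch of why the local flag matters, as I understand the problem. Everything here is hypothetical (the class, the fields, the two-shard setup with DF 10 and 5) and it is not the SolrIndexSearcher code; it only assumes that a shard without the flag reports the previously merged global DF as if it were its own.

```java
// Hypothetical model of the feedback loop; not the real SolrIndexSearcher code.
class ShardStatsSketch {
    private final long localDf;
    private long globalDf = 0; // last merged DF sent back to this shard, 0 = none yet

    ShardStatsSketch(long localDf) {
        this.localDf = localDf;
    }

    // With localOnly == false the shard returns the merged value once it has one.
    long docFreq(boolean localOnly) {
        if (localOnly || globalDf == 0) {
            return localDf;
        }
        return globalDf;
    }

    void setGlobalDf(long df) {
        this.globalDf = df;
    }

    // Simulates a coordinator that sums the shards' DF on every request
    // and pushes the merged value back to both shards.
    static long runRequests(boolean localOnly, int requests) {
        ShardStatsSketch a = new ShardStatsSketch(10);
        ShardStatsSketch b = new ShardStatsSketch(5);
        long merged = 0;
        for (int i = 0; i < requests; i++) {
            merged = a.docFreq(localOnly) + b.docFreq(localOnly);
            a.setGlobalDf(merged);
            b.setGlobalDf(merged);
        }
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(runRequests(false, 3)); // keeps growing
        System.out.println(runRequests(true, 3));  // stays at the true sum
    }
}
```

With the flag set, the merged DF stays at the sum of the local values no matter how many requests are made; without it, the merged value is fed back into the next merge and keeps growing.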
Also, the warning "## Missing global termStats info: " + term + ", using local" should perhaps not be a warning at all. It is also emitted for fields that do not contain those terms, because the check in returnLocalStats doesn't add terms with docFreq == 0.
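The following sketch shows how skipping terms with docFreq == 0 leads to a missing global entry later on. The method names and maps are made up for illustration and are not the returnLocalStats implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the missing-stats warning; not the real Solr code.
class MissingStatsSketch {
    // Shard side: only terms that actually occur locally are reported.
    static Map<String, Long> collectLocalStats(Map<String, Long> localDf, List<String> queryTerms) {
        Map<String, Long> reported = new HashMap<>();
        for (String term : queryTerms) {
            long df = localDf.getOrDefault(term, 0L);
            if (df > 0) { // terms with docFreq == 0 are skipped
                reported.put(term, df);
            }
        }
        return reported;
    }

    // Lookup side: a term without an entry is where the warning would be logged.
    static boolean wouldWarn(Map<String, Long> globalStats, String term) {
        return !globalStats.containsKey(term);
    }

    public static void main(String[] args) {
        Map<String, Long> localDf = new HashMap<>();
        localDf.put("solr", 4L);
        Map<String, Long> stats = collectLocalStats(localDf, Arrays.asList("solr", "nutch"));
        System.out.println(wouldWarn(stats, "solr"));
        System.out.println(wouldWarn(stats, "nutch"));
    }
}
```

Here "nutch" does not occur in the field, so it never gets an entry and the lookup hits the warning path, even though nothing is actually wrong.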
To make it work, add <globalStats class="org.apache.solr.search.stats.ExactStatsCache"/> to the config section of your solrconfig.
Please check my patch and let's fix this issue, so we can hopefully get distributed IDF in Solr 4.7.