Ok, this took rather longer than expected... Initially I tried to make stat fetching part of partition pruning, but that requires too many API changes all over the place; it can be added later as an extra optimization if necessary.
The alternative is simple: the calls that get stats are all batched. The new Thrift APIs use a req/resp pattern; a request contains the db, table, column list, and partition list (for partitions). The response contains whatever stats can be found (rather than the full list with some nulls, like the old APIs that built lists using individual calls to the metastore). The code then uses this.
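To illustrate the shape of the batched call, here is a rough sketch in plain Java. It is not the actual Thrift-generated code from the patch: the names StatsRequest, StatsResponse, getPartitionStats and key are made up for the example, the "metastore" is just an in-memory map, and a single long stands in for a real column stats object. The point is only that one request carries all columns and partitions, and the response omits anything that was not found instead of padding with nulls.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchedStatsSketch {

    /** Hypothetical request: one call names every column and partition of interest. */
    static final class StatsRequest {
        final String db;
        final String table;
        final List<String> columns;
        final List<String> partitions;

        StatsRequest(String db, String table, List<String> columns, List<String> partitions) {
            this.db = db;
            this.table = table;
            this.columns = columns;
            this.partitions = partitions;
        }
    }

    /** Hypothetical response: partition -> (column -> stat). Missing stats are simply absent. */
    static final class StatsResponse {
        final Map<String, Map<String, Long>> statsByPartition = new HashMap<>();
    }

    /** Builds the lookup key used by the in-memory stand-in for the metastore. */
    static String key(String db, String table, String partition, String column) {
        return db + "." + table + "/" + partition + "/" + column;
    }

    /** Answers the whole request in one pass, returning only the stats that exist. */
    static StatsResponse getPartitionStats(Map<String, Long> store, StatsRequest req) {
        StatsResponse resp = new StatsResponse();
        for (String part : req.partitions) {
            for (String col : req.columns) {
                Long stat = store.get(key(req.db, req.table, part, col));
                if (stat == null) {
                    continue; // not found: omit it rather than return a null placeholder
                }
                resp.statsByPartition
                    .computeIfAbsent(part, p -> new HashMap<>())
                    .put(col, stat);
            }
        }
        return resp;
    }

    public static void main(String[] args) {
        // Stand-in store: ds=1 has stats for both columns, ds=2 for one, ds=3 for none.
        Map<String, Long> store = new HashMap<>();
        store.put(key("db", "t", "ds=1", "a"), 10L);
        store.put(key("db", "t", "ds=1", "b"), 20L);
        store.put(key("db", "t", "ds=2", "a"), 30L);

        StatsRequest req = new StatsRequest("db", "t",
            Arrays.asList("a", "b"), Arrays.asList("ds=1", "ds=2", "ds=3"));
        StatsResponse resp = getPartitionStats(store, req);

        System.out.println(resp.statsByPartition);
    }
}
```

With this shape the caller has to handle absent entries explicitly (for example, ds=3 never appears in the response above), but it no longer has to scan a positional list for nulls.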
On the metastore side there are both a JDO path and a SQL path, for speed.
Also, I cleaned up some code in StatOptimizer and StatsUtil that was generally suboptimal.