Each HRI, unoptimized, is probably about 400 bytes. If we minimally binary encode it, we're probably talking closer to 100-150 bytes for all of a region's information. Add another 32 bytes of overhead from the object itself, and call it 200 bytes per region at the high end. I don't think the region historian belongs in ZK at all.
This is small metadata: several orders of magnitude below ZooKeeper's default 1MB znode limit, and well under 1KB per region. These are not large objects.
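Back-of-envelope, using the ~200 bytes/region figure above and a made-up 10,000-region cluster: 10,000 x 200 bytes ≈ 2MB of region metadata in total, spread across 10,000 znodes of ~200 bytes each. Even the whole table's worth is tiny, and no single znode comes anywhere near the 1MB limit.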
I do believe this could be a big win, for two reasons. One, HBase has no db-level replication, so requests for a segment of .META. always go to a single node (this is why we still run some key/value caching at Streamy on top of HBase for the most commonly read rows). ZooKeeper replicates the data across all nodes, so reads are fully distributed. Two, the code dealing with .META. is nasty and has always caused problems. Supporting an alternate row order (ascending, for example) would be rather easy in ZK versus how we do it now.
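To make that concrete, here's a minimal sketch of what a ZK-backed read path might look like, assuming one znode per region holding the binary-encoded HRI; the /hbase/regions layout and the "zkhost:2181" address are made up for illustration:

import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ZkMetaSketch {
  public static void main(String[] args) throws Exception {
    // Standard idiom: wait for the session to be established
    // before issuing any requests.
    final CountDownLatch connected = new CountDownLatch(1);
    ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, new Watcher() {
      public void process(WatchedEvent e) {
        if (e.getState() == Watcher.Event.KeeperState.SyncConnected) {
          connected.countDown();
        }
      }
    });
    connected.await();

    // Hypothetical layout: one znode per region under /hbase/regions,
    // each holding the ~200-byte binary-encoded HRI. Any server in the
    // ensemble can answer these reads, so lookups spread across the
    // quorum instead of hitting whichever node hosts that slice of META.
    List<String> regions = zk.getChildren("/hbase/regions", false);

    // getChildren() makes no ordering promise; a client-side sort is
    // all "ascending row order" would take.
    Collections.sort(regions);

    for (String name : regions) {
      byte[] hri = zk.getData("/hbase/regions/" + name, false, null);
      // ... decode the HRI and cache it client-side ...
      System.out.println(name + ": " + hri.length + " bytes");
    }
    zk.close();
  }
}

Compare that one sort() call to what reordering .META. would take with the current code.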
However, META as a special table in HBase does work now (and is not really a bottleneck yet)...
So I vote to bump further discussion of this to 0.22. I'd like to get to the next release ASAP, and there is a beast of a problem to solve (with ZK help) in our current assignment/cluster-task/load-balancing systems before moving META to ZK, if ever. If there were actual load balancing in HBase that balanced read load, it would help with both META and normal tables, and potentially remove almost all need for caching outside of HBase.