RuntimeProfile trees can stress the memory allocator and use far more memory and cache than necessary:
- std::map is used throughout and allocates a node per map entry. We do depend on the counters being displayed in order, but we would probably be better off storing the counters in a vector and lazily sorting when needed (since the set of counters is generally static after Prepare()).
- We store the same counter names redundantly all over the place. We'd probably be best off using a pool of constant counter names (we could just require registering them upfront).
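A rough sketch of how the two ideas above could fit together. This is not the existing RuntimeProfile code: CounterNamePool, CounterList, Intern() and SortedEntries() are hypothetical names made up for illustration, and counters are reduced to a plain int64_t value. Names are interned once in a pool, and counters live in a flat vector that is only sorted when an ordered view is requested.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical pool: each distinct counter name is stored exactly once.
// Pointers to elements of an unordered_set stay valid across rehashing,
// so the returned pointer can be shared by every profile that uses the name.
class CounterNamePool {
 public:
  const std::string* Intern(const std::string& name) {
    return &*names_.insert(name).first;
  }
  std::size_t size() const { return names_.size(); }

 private:
  std::unordered_set<std::string> names_;
};

// Hypothetical replacement for a std::map of counters: one contiguous vector,
// sorted by name only when an in-order view is actually needed.
class CounterList {
 public:
  struct Entry {
    const std::string* name;  // owned by CounterNamePool, not by the entry
    int64_t value;
  };

  void Add(const std::string* name, int64_t value) {
    assert(name != nullptr);
    entries_.push_back(Entry{name, value});
    sorted_ = false;  // defer sorting until someone asks for ordered output
  }

  // Sorts at most once per batch of Add() calls. If the set of counters is
  // static after Prepare(), the sort cost is paid a single time.
  const std::vector<Entry>& SortedEntries() {
    if (!sorted_) {
      std::sort(entries_.begin(), entries_.end(),
                [](const Entry& a, const Entry& b) { return *a.name < *b.name; });
      sorted_ = true;
    }
    return entries_;
  }

 private:
  std::vector<Entry> entries_;
  bool sorted_ = true;  // an empty list is trivially sorted
};
```

With this layout each counter costs one pointer plus its value in a contiguous array, instead of a separately allocated map node holding its own copy of the name. A real version would also need to keep Counter pointers stable, which this sketch does not attempt.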
There may be a small gain from switching the Thrift-generated code to use unordered_map, e.g. for the info strings that appear with some frequency in profiles.
However, I think we need to restructure the thrift representation and in-memory representation to get significant gains.