I looked at the RuntimeProfile::ToThrift() part of the dop16 profile from the parent JIRA - https://issues.apache.org/jira/secure/attachment/12993392/coord_q5_dop16.svg.
There are some minor inefficiencies within ToThrift(), e.g. constructing vectors then copying them immediately.
The majority of time, though, is spent in TRuntimeProfileNode() and ~TRuntimeProfileNode(). The only place those would be invoked from ToThrift() is in this line at https://github.com/apache/impala/blob/f2f348c0f93208a0f34c33b6a4dc82f4d9d4b290/be/src/util/runtime-profile.cc#L1174:
I scratched my head and started at our code a bit, then went and looked at the std::vector and thrift-generated code. I believe this line in std::vector is the problem https://github.com/gcc-mirror/gcc/blob/releases/gcc-4.9.2/libstdc%2B%2B-v3/include/bits/vector.tcc#L421
It can't use the move constructor of the TRuntimeProfileNode, because of some c++ exception-safety guarantees and the fact that the constructor is not marked noexcept. I don't think there's an easy way to avoid this without changing the code generated by thrift.
I played around with the compiler explorer and it looks like it generates much better code with the noexcept added: https://godbolt.org/z/ZTSMHY