TCMalloc's PageHeap::AllocLarge() has O(n) behavior, where n is the number of free large spans it has to scan. As the heap gets fragmented, n grows, and this can lead to contention because the thread executing PageHeap::AllocLarge() holds a lock for the duration of the search. In recent versions of gperftools, this code has been modified to have O(log n) behavior, which could reduce contention significantly in some cases.
We can get this fix by upgrading to a newer version of gperftools (see https://issues.apache.org/jira/browse/IMPALA-6784 ). However, the patches for the O(log n) behavior are fairly self-contained. Here are the two patches needed:
These would be easy to port to gperftools-2.5. This Jira tracks that effort (which is separate and would be superseded if we upgrade gperftools).