Root-causing performance issues often comes down to finding skew in the work done across hosts. Uneven assignment of fragments, or of bytes scanned per host, is a common cause.
Making this information readily available in the query profile will help speed up RCA.
The proposal is to add two tables to the query profile that cover:
- Number of fragments per host
- Number of bytes scanned per host
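
As a rough illustration of what the two tables could contain, here is a minimal Python sketch. It assumes the per-instance data is available as `(host, bytes_scanned)` pairs, one per fragment instance. The function names `per_host_summary` and `format_table`, the input shape, and the text layout are all hypothetical and are not taken from the actual profile code.

```python
from collections import defaultdict


def per_host_summary(instances):
    """Aggregate hypothetical (host, bytes_scanned) records.

    Each record stands for one fragment instance. Returns two dicts:
    fragment count per host and total bytes scanned per host.
    """
    fragments = defaultdict(int)
    bytes_scanned = defaultdict(int)
    for host, nbytes in instances:
        fragments[host] += 1
        bytes_scanned[host] += nbytes
    return dict(fragments), dict(bytes_scanned)


def format_table(title, per_host):
    """Render one per-host table as plain text, hosts sorted by name."""
    width = max(len(host) for host in per_host)
    lines = [title]
    for host in sorted(per_host):
        lines.append(f"  {host:<{width}}  {per_host[host]}")
    return "\n".join(lines)
```

With both tables side by side, a host whose fragment count or scanned bytes is far above the others stands out immediately, which is the skew signal the proposal is after.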