The existing profiles in KrpcDataStreamRecvr and KrpcDataStreamSender made it hard to diagnose slow queries like those shown in
IMPALA-6657. In particular, there are times when the receiver's profile shows a lot of time spent waiting for row batches to arrive while the sender's profile also shows a lot of time spent waiting for responses to the TransmitData() RPC.
A few improvements can be made to make the problem slightly easier to diagnose:
- track the number of deferred row batches over time in KrpcDataStreamRecvr
- track the number of bytes dequeued over time in KrpcDataStreamRecvr
- track the amount of time row batches spend in the deferred queue
- track the number of bytes sent from KrpcDataStreamSender over time
The above items help identify cases in which one fragment instance containing an exchange node is slow for a period of time (e.g. the parent of the exchange node spills heavily), causing all senders to that fragment instance to block waiting for responses. Because all senders are blocked waiting for their previous RPCs to complete, they do not produce more rows, and all other fragment instances are starved, leading to the high wait time shown in their receivers' profiles. The time series counter for the number of deferred row batches in a receiver helps identify the cases described above.
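
As a rough illustration of how such a time series could be read, here is a minimal C++ sketch. It is not Impala's RuntimeProfile API: TimeSeriesSketch and LongestRunAbove are hypothetical names invented for this example, and the sampling interval is assumed to be fixed. The idea is that a long run of consecutive non-zero samples for the deferred row batch count suggests a receiver that stayed slow for a sustained period.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a time series counter (not Impala's actual class).
// Each call to AddSample() records the tracked value at one sampling tick.
class TimeSeriesSketch {
 public:
  void AddSample(int64_t value) { samples_.push_back(value); }
  const std::vector<int64_t>& samples() const { return samples_; }

 private:
  std::vector<int64_t> samples_;
};

// Returns the length of the longest run of consecutive samples that are
// strictly greater than 'threshold'. With threshold 0 and samples of the
// deferred row batch count, this is the longest stretch of sampling ticks
// during which the receiver had at least one deferred batch.
size_t LongestRunAbove(const std::vector<int64_t>& samples, int64_t threshold) {
  size_t best = 0;
  size_t current = 0;
  for (int64_t v : samples) {
    current = (v > threshold) ? current + 1 : 0;
    if (current > best) best = current;
  }
  return best;
}
```

Comparing the longest run across the receivers of a query would, under these assumptions, point at the fragment instance whose deferred queue stayed non-empty the longest.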