Took a look at this. The main issue is that we track a number of metrics per request type, and all of them end up being tracked in the network layer. This has made the network layer ugly. These metrics really fall into two types: network-level metrics and request-level metrics.
1. Network-level metrics are those that do not differ much by request type. For example, queueTime is the amount of time a request spends in the request queue. Tracking it per request type does not really make sense; it is a network-level property and should be tracked at that level.
2. Request-level metrics are those that differ significantly by request type. For example, local time is the amount of time the KafkaApi handle method takes to complete. This depends largely on the request type and should be tracked at the request level.
To summarize, with the decoupling described above we have the following metrics across the two levels:
// time a request spent in a request queue
val queueTimeHist = newHistogram(name + "-QueueTimeMs")
// time to send the response to the requester from the response queue
val responseSendTimeHist = newHistogram(name + "-ResponseSendTimeMs")
// total time taken for the request to be served
val totalTimeHist = newHistogram(name + "-TotalTimeMs")
// request rate by type
val requestRate = newMeter(name + "-RequestsPerSec", "requests", TimeUnit.SECONDS)
// time a request takes to be processed at the local broker
val localTimeHist = newHistogram(name + "-LocalTimeMs")
// time a request takes to wait on remote brokers (only relevant to fetch and produce requests)
val remoteTimeHist = newHistogram(name + "-RemoteTimeMs")
With the separation described above, the total time of any request breaks down as
queueTime + localTime + remoteTime + responseSendTime = totalTime
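As a rough illustration of that identity, here is a minimal sketch that derives each component from timestamps recorded as a request moves through the broker. The `RequestTimestamps` class and its field names are hypothetical and not part of the existing code. It also assumes that remote time ends when the response is placed on the response queue, and that responseSendTime is measured from that point.

```scala
// Hypothetical timestamps (in ms) captured as a request moves through the broker.
// None of these names come from the existing code base.
final case class RequestTimestamps(
  enqueuedMs: Long,      // request placed on the request queue
  dequeuedMs: Long,      // a handler picked the request up
  localDoneMs: Long,     // local processing finished
  responseReadyMs: Long, // response placed on the response queue (assumed end of remote wait)
  responseSentMs: Long   // response fully written to the requester
) {
  def queueTime: Long        = dequeuedMs - enqueuedMs
  def localTime: Long        = localDoneMs - dequeuedMs
  def remoteTime: Long       = responseReadyMs - localDoneMs
  def responseSendTime: Long = responseSentMs - responseReadyMs
  def totalTime: Long        = responseSentMs - enqueuedMs
}

object TimingDemo {
  def main(args: Array[String]): Unit = {
    val t = RequestTimestamps(100L, 104L, 110L, 125L, 127L)
    // The four components are consecutive intervals, so they sum to the total.
    assert(t.queueTime + t.localTime + t.remoteTime + t.responseSendTime == t.totalTime)
    println(s"queue=${t.queueTime} local=${t.localTime} remote=${t.remoteTime} " +
      s"send=${t.responseSendTime} total=${t.totalTime}")
  }
}
```

Because each component is the gap between two consecutive timestamps, the sum holds by construction. For requests that never wait on remote brokers, `localDoneMs` and `responseReadyMs` would coincide and remoteTime would be zero.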
With the proposed separation, we can completely remove any KafkaApi dependency from the network layer.
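One possible shape for the decoupling, as a sketch only: the network layer owns a single set of histograms and knows nothing about request types, while the API layer owns one set per request type. `SimpleHistogram`, `NetworkMetrics`, and `RequestLevelMetrics` are hypothetical stand-ins invented for this example (the real code would use the metrics library's histogram via `newHistogram`). Only queueTime and local time are shown, since those are the two whose level is stated explicitly above.

```scala
import scala.collection.mutable

// Stand-in for a real histogram; it only stores the samples it is given.
// Not thread-safe; a real implementation would need to be.
final class SimpleHistogram {
  private val samples = mutable.ArrayBuffer.empty[Long]
  def update(value: Long): Unit = samples += value
  def count: Int = samples.size
  def max: Long = if (samples.isEmpty) 0L else samples.max
}

// Network level: one set of metrics, with no knowledge of request types.
object NetworkMetrics {
  val queueTimeHist = new SimpleHistogram
}

// Request level: one histogram per request type, owned by the API layer.
object RequestLevelMetrics {
  private val localTimeHists = mutable.Map.empty[String, SimpleHistogram]

  // Lazily creates the histogram the first time a request type is seen.
  def localTimeHist(requestType: String): SimpleHistogram =
    localTimeHists.getOrElseUpdate(requestType, new SimpleHistogram)
}
```

Under this layout the network layer would call `NetworkMetrics.queueTimeHist.update(...)` when a request is dequeued, and the handler would call `RequestLevelMetrics.localTimeHist("Produce").update(...)` once it finishes, so the request type never has to be visible to the network code.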