Since KIP-219 changed how quotas are communicated, a throttled fetch request is answered with an empty response that carries the delay in `throttle_time_ms`, and the Kafka consumer retries after that delay.
With default configs, a fetch response can be as large as 50MB (`fetch.max.bytes`; the per-partition limit, `max.partition.fetch.bytes`, defaults to 1MB). The default broker quota config (1-second windows, with bandwidth/thread utilization tracked over 10 full windows) means that a consumer quota below 5MB/s (per broker) may prevent a fetch request from ever succeeding, since 50MB / 10 seconds = 5MB/s.
Or the other way around: a 1MB/s consumer quota (per broker) means that any fetch request that would return >= 10MB of data (10 seconds * 1MB/second) in the response will never get through. From the consumer's point of view, the behavior is as follows: the consumer gets an empty response with `throttle_time_ms` > 0, waits for the throttle time, and sends the fetch request again; the fetch response is still too big, so the broker sends another empty response with a throttle time, and so on in a never-ending loop.
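The arithmetic behind both examples can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and the rule that a response of at least the window budget never gets through is taken from the ">= 10MB" statement above rather than from the broker code.

```java
/** Back-of-the-envelope arithmetic for the quota window described above (illustrative only). */
final class QuotaWindowMath {
    static final long MB = 1024L * 1024L;

    /** Bytes that fit in the tracked window: quota (bytes/s) * full windows * window size (s). */
    static long windowBudgetBytes(long quotaBytesPerSec, int fullWindows, int windowSizeSec) {
        return quotaBytesPerSec * fullWindows * windowSizeSec;
    }

    /** Assumption from the text: a response of at least the window budget is never delivered. */
    static boolean neverGetsThrough(long responseBytes, long quotaBytesPerSec,
                                    int fullWindows, int windowSizeSec) {
        return responseBytes >= windowBudgetBytes(quotaBytesPerSec, fullWindows, windowSizeSec);
    }

    public static void main(String[] args) {
        // Default broker config from the text: 10 full windows of 1 second each.
        long budget = windowBudgetBytes(1 * MB, 10, 1);
        System.out.println("Budget for a 1MB/s quota, in MB: " + budget / MB);
        System.out.println("10MB response stuck at 1MB/s: " + neverGetsThrough(10 * MB, 1 * MB, 10, 1));
        System.out.println("50MB response stuck at 4MB/s: " + neverGetsThrough(50 * MB, 4 * MB, 10, 1));
    }
}
```

With the default window settings, the budget is simply ten times the per-second quota, which is where the 5MB/s and 10MB figures come from.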
Proposed fix: return less data in the fetch response in this case. Cap the `fetchMaxBytes` that KafkaApis.handleFetchRequest() passes to replicaManager.fetchMessages() at <tracking window> * <consume bandwidth quota>. In the example of default configs and a 1MB/s consumer bandwidth quota, `fetchMaxBytes` will be capped at 10MB.
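A minimal sketch of the cap, written in Java for illustration. The helper `cappedFetchMaxBytes` and its parameters are hypothetical and are not the actual KafkaApis code; it assumes the quota is available as bytes per second and the tracking window as whole seconds.

```java
/** Hypothetical helper illustrating the proposed cap; not the actual broker implementation. */
final class FetchSizeCap {
    /**
     * Caps the requested fetch size at windowSeconds * quotaBytesPerSec.
     * Both inputs are illustrative; a real broker would obtain them from its quota configuration.
     */
    static int cappedFetchMaxBytes(int requestedMaxBytes, long quotaBytesPerSec, int windowSeconds) {
        if (requestedMaxBytes < 0 || quotaBytesPerSec < 0 || windowSeconds <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and the window positive");
        }
        // Saturate instead of overflowing when the quota is effectively unlimited.
        long budget = quotaBytesPerSec > Long.MAX_VALUE / windowSeconds
                ? Long.MAX_VALUE
                : quotaBytesPerSec * windowSeconds;
        // The result never exceeds requestedMaxBytes, so the narrowing cast is safe.
        return (int) Math.min((long) requestedMaxBytes, budget);
    }

    public static void main(String[] args) {
        int mb = 1024 * 1024;
        // Example from the text: 50MB requested, 1MB/s quota, 10-second tracking window.
        System.out.println(cappedFetchMaxBytes(50 * mb, mb, 10) / mb);
    }
}
```

A request that already asks for less than the window budget is left unchanged, so consumers with a generous quota would see no difference.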