Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 0.10.0.1
- None
- None
Description
The basic idea of batch expiration is that we don't expire batches while the producer thinks it can make progress. Currently the notion of "making progress" involves only in-flight requests (muted partitions). That's not sufficient. The other half of "making progress" is metadata freshness: if the metadata is stale, we cannot trust it, and therefore cannot conclude that we are unable to make progress. Therefore, we shouldn't expire batches when metadata is stale. This also implies we don't want to expire batches when we can still make progress, even if the batch has remained in the queue longer than the batch expiration time.
The current condition in abortExpiredBatches that bypasses muted partitions is necessary but not sufficient. It should additionally prevent expiration when metadata is stale.
In other words, it should expire a batch only when all of the following are true:
- the partition is not muted (!muted) AND
- the metadata is fresh AND
- the batch has remained in the queue longer than the request timeout.
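The three conditions above could be combined roughly as in the sketch below. This is not the actual abortExpiredBatches code: the class BatchExpirySketch, both method names, and all parameters are hypothetical, and treating metadata as "fresh" when it was refreshed within some maximum age is an assumption made only for illustration.

```java
// Hypothetical sketch of the proposed expiration condition.
// None of these names come from the Kafka code base.
final class BatchExpirySketch {
    private BatchExpirySketch() {}

    // Assumption: metadata counts as fresh if it was refreshed
    // no longer than metadataMaxAgeMs ago.
    static boolean isMetadataFresh(long nowMs, long lastRefreshMs, long metadataMaxAgeMs) {
        return nowMs - lastRefreshMs <= metadataMaxAgeMs;
    }

    // A batch may be expired only if the partition is not muted,
    // the metadata is fresh, and the batch has waited in the queue
    // longer than the request timeout.
    static boolean shouldExpire(boolean muted,
                                boolean metadataFresh,
                                long queuedMs,
                                long requestTimeoutMs) {
        return !muted && metadataFresh && queuedMs > requestTimeoutMs;
    }
}
```

With this shape, a batch that has been queued past the timeout is still kept if either the partition is muted or the metadata is stale, which is the behavior asked for above.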