Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Kafka Streams has the config `windowstore.changelog.additional.retention.ms` to allow for an increased changelog retention time.
While an increased retention time can be useful, it can also lead to unnecessary restore cost, especially for stream-stream joins. Assume a stream-stream join with a 1h window size and a 1h grace period. In this case, we only need the most recent 2h of data to restore. If we lag, `windowstore.changelog.additional.retention.ms` helps to prevent the broker from truncating data too early. However, if we don't lag and we need to restore, we still restore everything from the changelog, including data older than those 2h.
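To make the cost concrete, here is a rough sketch of the arithmetic. It assumes the changelog keeps roughly the data the store needs plus the additional retention, and it uses 24h for the additional retention as an illustrative value. The class and method names are hypothetical and not part of Kafka Streams.

```java
import java.time.Duration;

/** Illustrative arithmetic only; names are hypothetical, not Kafka Streams APIs. */
final class ChangelogRetentionSketch {

    // Data a restore actually needs in the example above: window size + grace period.
    static Duration neededForRestore(Duration windowSize, Duration gracePeriod) {
        return windowSize.plus(gracePeriod);
    }

    // Assumption: the changelog retains the needed data plus the additional retention.
    static Duration changelogRetention(Duration needed, Duration additionalRetention) {
        return needed.plus(additionalRetention);
    }

    // Span of data that a seek-to-beginning restore reads without needing it.
    static Duration unnecessaryRestoreSpan(Duration needed, Duration additionalRetention) {
        return changelogRetention(needed, additionalRetention).minus(needed);
    }
}
```

With a 1h window, a 1h grace period, and 24h of additional retention, the restore needs 2h of data but a full changelog could hold about 26h, so up to 24h of it would be read for nothing.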
Instead of doing a seek-to-beginning, we could use the timestamp index to seek to the first offset that falls within the 2h "window" of data that we need to restore, and skip everything older to avoid unnecessary work.
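A minimal sketch of the idea, not an implementation proposal: compute the oldest timestamp the restore still needs and look up the first offset at or after it. The `NavigableMap` is a hypothetical in-memory stand-in for the broker's timestamp index and assumes timestamps increase with offsets. With a real consumer, the lookup would presumably go through `KafkaConsumer#offsetsForTimes`, which returns the earliest offset whose timestamp is greater than or equal to the requested one. Whether the reference time should be wall-clock time or the store's observed stream time is left open here.

```java
import java.time.Duration;
import java.util.Map;
import java.util.NavigableMap;

/** Illustrative sketch only; names are hypothetical, not Kafka Streams APIs. */
final class RestoreStartSketch {

    // Oldest record timestamp the restore still needs:
    // reference time minus (window size + grace period), clamped at 0.
    static long restoreStartTimestampMs(long referenceTimeMs, Duration windowSize, Duration gracePeriod) {
        return Math.max(0L, referenceTimeMs - windowSize.plus(gracePeriod).toMillis());
    }

    // timestampIndex maps record timestamp -> offset (stand-in for the timestamp index).
    // Returns the earliest offset whose timestamp is >= startTimestampMs,
    // or endOffset if no such record exists (nothing needs to be restored).
    static long restoreStartOffset(NavigableMap<Long, Long> timestampIndex,
                                   long startTimestampMs,
                                   long endOffset) {
        Map.Entry<Long, Long> entry = timestampIndex.ceilingEntry(startTimestampMs);
        return entry == null ? endOffset : entry.getValue();
    }
}
```

The restore would then seek to the returned offset instead of the beginning of the changelog partition. Falling back to the end offset when nothing qualifies mirrors the case where all retained data is already older than the window plus grace period.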
Attachments
Issue Links
- relates to KAFKA-7934 Optimize restore for windowed and session stores (Open)