This bug is caused by the improvements made in https://issues.apache.org/jira/browse/KAFKA-10847, which fixes an issue with stream-stream left/outer joins. The issue is only caused when a stream-stream left/outer join is used with the new `JoinWindows.ofTimeDifferenceAndGrace()` API that specifies the window time + grace period. This new API was added in AK 3.0. No previous users are affected.
The issue causes that the internal changelog topic used by the new OUTERSHARED window store keeps growing unbounded as new records come. The topic is never cleaned up nor compacted even if tombstones are written to delete the joined and/or expired records from the window store. The problem is caused by a parameter required in the window store to retain duplicates. This config causes that tombstones records have a new sequence ID as part of the key ID in the changelog making those keys unique. Thus causing the cleanup policy not working.
In 3.0, we deprecated JoinWindows.of(size) in favor of JoinWindows.ofTimeDifferenceAndGrace() -- the old API uses the old semantics and is thus not affected while the new API enable the new semantics; the problem is that we deprecated the old API and thus tell users that they should switch to the new broken API.
We have two ways forward:
- Fix the bug (non trivial)
- Un-deprecate the old JoinWindow.of(size) API (and tell users not to use the new but broken API)