Affects Version/s: None
Fix Version/s: None
Just spent some time wrapping my head around the inner workings of compaction and tombstoning, with a view to providing guarantees for deleting previous values of tombstoned keys from Kafka within a desired time.
There's a couple of good posts that touch on this:
Some existing controls:
log.cleaner.min.cleanable.ratio - minimum ratio of dirty (not yet compacted) log to total log before a log will be considered for compaction
min.cleanable.dirty.ratio - topic level override for the above
min.compaction.lag.ms - minimum time a record will exist (eg: to ensure it can be consumed before being compacted)
delete.retention.ms - how long a tombstone record is kept before it may be compacted away (ie: so downstream consumers can be given time to see it)
segment.ms - maximum time before a new segment is rolled (compaction only happens on inactive segments)
segment.bytes - the size of the segment. (compaction only happens on inactive segments)
log.cleaner.io.max.bytes.per.second - global setting limiting IO of the log cleaner thread
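As a rough illustration of how the ratio settings gate compaction, here is a minimal sketch of a simplified eligibility check. The function names and byte counts are made up for illustration, and the real log cleaner applies more conditions than this (for example min.compaction.lag.ms and the active-segment exclusion).

```python
def dirty_ratio(clean_bytes: int, dirty_bytes: int) -> float:
    """Fraction of the inactive-segment bytes that have not been cleaned yet."""
    total = clean_bytes + dirty_bytes
    return dirty_bytes / total if total else 0.0


def is_cleanable(clean_bytes: int, dirty_bytes: int,
                 min_cleanable_dirty_ratio: float) -> bool:
    """Simplified model: the log is a candidate once its dirty ratio
    exceeds the configured threshold. With a threshold of 0, any dirty
    data at all makes the log a candidate."""
    return dirty_ratio(clean_bytes, dirty_bytes) > min_cleanable_dirty_ratio


if __name__ == "__main__":
    # 100 dirty bytes out of 1000: not enough for a threshold of 0.5,
    # but enough once the threshold is forced to 0.
    print(is_cleanable(900, 100, 0.5))
    print(is_cleanable(900, 100, 0.0))
```

This is why setting the ratio to 0 makes compaction aggressive: any uncleaned data in a rolled segment is enough to qualify the log.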
Currently the controls focus on guaranteeing a minimum time that records and tombstones will exist before they /may/ be compacted. But if you want to guarantee they /will/ be compacted by a specific time, there is no control.
To achieve this now, log.cleaner.min.cleanable.ratio or min.cleanable.dirty.ratio is hijacked to force aggressive compaction (by setting it to 0, or 0.000000001 depending on what you read). Along with segment.ms, this can provide a timing guarantee that a tombstone will cause any other values for the key to be deleted within a desired time, /if/ a new record comes in to trigger a new segment roll.
But that sacrifices the utility of min.cleanable.dirty.ratio (and to a lesser extent, control over segment sizes). Compaction will run on any duplicate key once a new segment rolls, when otherwise it might be preferable to allow a more generous dirty.ratio in the case of plain old duplicates.
It would be useful to have control over triggering a compaction without losing the utility of the dirty.ratio setting. The pure need here is to specify a maximum time the log cleaner may go without running on a topic that has keys replaced by a tombstone message, once those records are past the minimum retention time provided by min.compaction.lag.ms.
Something like a global log.cleaner.max.delay.ms, a topic-level max.cleanable.delay.ms (or max.compaction.lag.ms?), and an API to trigger compaction, with some nuances to be fleshed out.
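To make the request concrete, here is a minimal sketch of how such a maximum-delay setting could sit alongside the dirty ratio check. Everything here is hypothetical: the function, its parameters, and the idea of tracking the timestamp of the oldest uncompacted record are assumptions for illustration, not a description of how the log cleaner works.

```python
from typing import Optional


def should_compact(dirty_ratio: float,
                   min_cleanable_dirty_ratio: float,
                   oldest_dirty_timestamp_ms: Optional[int],
                   max_compaction_lag_ms: Optional[int],
                   now_ms: int) -> bool:
    """Hypothetical rule: compact when the dirty ratio exceeds its
    threshold (the existing trigger), OR when the oldest uncompacted
    record has waited at least max_compaction_lag_ms (the proposed one)."""
    if dirty_ratio > min_cleanable_dirty_ratio:
        return True
    if max_compaction_lag_ms is None or oldest_dirty_timestamp_ms is None:
        # No time bound configured, or nothing is waiting to be compacted.
        return False
    return now_ms - oldest_dirty_timestamp_ms >= max_compaction_lag_ms


if __name__ == "__main__":
    day_ms = 24 * 60 * 60 * 1000
    # Ratio is below a generous threshold, but the oldest dirty record
    # is 31 days old and the (hypothetical) limit is 30 days.
    print(should_compact(0.1, 0.5, 0, 30 * day_ms, 31 * day_ms))
```

With a rule like this, the dirty ratio could stay at a generous value for plain duplicates while the time bound covers the deletion guarantee.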
In the meantime, this can be worked around with some duct tape:
- make sure any values you want deleted by a tombstone have passed min retention configs
- set global log.cleaner.io.max.bytes.per.second to what you want for the compaction task
- set topic min.cleanable.dirty.ratio=0 for the topic
- set a small segment.ms
- wait for a new segment to roll (segment.ms elapsed + a message coming in) and wait for compaction to kick in. GDPR met!
- undo the hacks
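The topic-level parts of the steps above could be scripted roughly as follows. This is a sketch that only builds kafka-configs.sh command lines as strings. The broker address, topic name and segment.ms value are placeholders, and depending on the Kafka version the tool may need --zookeeper rather than --bootstrap-server to alter topic configs.

```python
import shlex
from typing import Dict, List, Optional


def alter_topic_config_cmd(bootstrap: str, topic: str,
                           add: Optional[Dict[str, str]] = None,
                           delete: Optional[List[str]] = None) -> str:
    """Build a kafka-configs.sh command that adds or removes topic-level overrides."""
    args = ["kafka-configs.sh", "--bootstrap-server", bootstrap,
            "--entity-type", "topics", "--entity-name", topic, "--alter"]
    if add:
        args += ["--add-config", ",".join(f"{k}={v}" for k, v in add.items())]
    if delete:
        args += ["--delete-config", ",".join(delete)]
    return " ".join(shlex.quote(a) for a in args)


# Placeholder values: adjust for the actual cluster and topic.
HACKS = {"min.cleanable.dirty.ratio": "0", "segment.ms": "60000"}

if __name__ == "__main__":
    # Apply the hacks, then (after compaction has run) remove the overrides.
    print(alter_topic_config_cmd("localhost:9092", "my-topic", add=HACKS))
    print(alter_topic_config_cmd("localhost:9092", "my-topic", delete=list(HACKS)))
```

Removing the overrides with --delete-config returns the topic to the broker defaults, so any topic-level values that existed before the hack would need to be re-applied explicitly.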
Another workaround is to set min.compaction.lag.ms to, say, 31 days, along with min.cleanable.dirty.ratio=0, and then every 30 days set min.compaction.lag.ms to 0 until compaction happens. That loses the normal dirty.ratio behaviour that the above workaround preserves, though.