As currently implemented, KahaDB can only reclaim disk space by deleting an entire data file. As a result, the only situation in which KahaDB can reduce the amount of space it uses on disk is when a data file contains no messages that must still be kept; if even one message in an old file must be kept, the entire file cannot be deleted. And if a deleted message in the old file has its deletion record in a later file, that later file must also be kept, even if none of the messages in it are otherwise needed; as a result, a single old message can keep alive a long chain of data files.
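To make the chain effect concrete, here is a minimal sketch of the retention rule described above. It is a simplified model, not KahaDB's actual code: the class `JournalRetentionSketch`, the method `retainedFiles`, and both map parameters are hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model of the retention rule; none of these names are KahaDB APIs.
final class JournalRetentionSketch {

    /**
     * @param liveCounts file id -> number of live messages in that file
     * @param ackTargets file id -> ids of the files whose messages have their
     *                   deletion records stored in this file
     * @return ids of all files that cannot be deleted
     */
    static SortedSet<Integer> retainedFiles(Map<Integer, Integer> liveCounts,
                                            Map<Integer, Set<Integer>> ackTargets) {
        SortedSet<Integer> retained = new TreeSet<>();
        // Rule 1: a file holding at least one live message must be kept.
        for (Map.Entry<Integer, Integer> e : liveCounts.entrySet()) {
            if (e.getValue() > 0) {
                retained.add(e.getKey());
            }
        }
        // Rule 2: a file holding a deletion record for a message in a kept file
        // must also be kept. Repeat until nothing changes, because each newly
        // kept file can pin further files.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Map.Entry<Integer, Set<Integer>> e : ackTargets.entrySet()) {
                boolean pinsKeptFile = !Collections.disjoint(e.getValue(), retained);
                if (pinsKeptFile && retained.add(e.getKey())) {
                    changed = true;
                }
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        // Only file 1 has a live message, but file 2 holds a deletion record for
        // a message in file 1, and file 3 holds one for a message in file 2.
        Map<Integer, Integer> live = Map.of(1, 1, 2, 0, 3, 0, 4, 0);
        Map<Integer, Set<Integer>> acks =
                Map.of(2, Set.of(1), 3, Set.of(2), 4, Set.of(4));
        System.out.println(retainedFiles(live, acks)); // prints [1, 2, 3]
    }
}
```

In this model one live message in file 1 keeps files 2 and 3 on disk as well, while file 4, which only references itself, can be deleted.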
The advice given so far is 1) don't keep messages for very long, and 2) use small KahaDB files, so that you can delete at least some portions of what would otherwise have been a single large file that had to stick around (and in the hope that you'll get lucky and break the chain of kept files). Both are workarounds, and not very good ones, particularly since the entire concept of a DLQ is fundamentally opposed to #1. They paper over the fundamentally flawed assumption in KahaDB: that it's reasonable for its files to be immutable once written and for the database itself to be powerless when files are sparsely populated by live messages. The paradigm of append-only files, where deleting an individual message just appends a deletion record, was a good one and provides excellent performance characteristics; however, restricting occasional maintenance tasks to the same paradigm handcuffs them unreasonably and should be changed.
The periodic cleanup task that already looks for unused files should be changed as follows: if it determines that it cannot delete a file because the file contains at least one live message, but the file contains less than a configurable percentage of live messages, it will rewrite that journal file so that it contains only the live messages, updating any in-memory indices that record the offsets of messages within the file (if any exist). If any in-memory data structures need to be updated, we need to synchronize appropriately so that no one can use the portions of those structures related to the file currently being compacted; access to the same information for all other data files can continue unrestricted.
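The proposed cleanup behavior could be sketched roughly as follows. This is an illustration under stated assumptions, not a patch against KahaDB: `CompactionSketch`, `Entry`, `Action`, `decide`, `compactUnderLock`, and the `compactBelowLiveRatio` threshold are all hypothetical names, the journal file is modeled as an in-memory list, and the live percentage is measured by message count.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the proposed cleanup decision; not KahaDB code.
final class CompactionSketch {

    enum Action { DELETE, COMPACT, KEEP }

    /** One message record in a journal file (simplified). */
    static final class Entry {
        final String messageId;
        final int size;
        final boolean live;

        Entry(String messageId, int size, boolean live) {
            this.messageId = messageId;
            this.size = size;
            this.live = live;
        }
    }

    /** Decide what the cleanup task should do with one data file. */
    static Action decide(List<Entry> entries, double compactBelowLiveRatio) {
        long live = entries.stream().filter(e -> e.live).count();
        if (live == 0) {
            return Action.DELETE; // existing behavior: nothing live, drop the file
        }
        double liveRatio = (double) live / entries.size();
        return liveRatio < compactBelowLiveRatio ? Action.COMPACT : Action.KEEP;
    }

    // One lock per data file, so compacting one file never blocks access to others.
    private final Map<Integer, ReentrantReadWriteLock> fileLocks = new ConcurrentHashMap<>();

    /**
     * Computes the offsets the live messages would have after the file is
     * rewritten. Holding the write lock for this file models excluding readers
     * of its index entries while the rewrite is in progress.
     */
    Map<String, Long> compactUnderLock(int fileId, List<Entry> entries) {
        ReentrantReadWriteLock lock =
                fileLocks.computeIfAbsent(fileId, id -> new ReentrantReadWriteLock());
        lock.writeLock().lock();
        try {
            Map<String, Long> newOffsets = new LinkedHashMap<>();
            long offset = 0;
            for (Entry e : entries) {
                if (e.live) {
                    newOffsets.put(e.messageId, offset);
                    offset += e.size;
                }
            }
            return newOffsets;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        List<Entry> file = Arrays.asList(
                new Entry("a", 100, false), new Entry("b", 50, true),
                new Entry("c", 200, false), new Entry("d", 25, true));
        System.out.println(decide(file, 0.6));
        System.out.println(new CompactionSketch().compactUnderLock(7, file));
    }
}
```

Readers of the index entries for a given file would take the corresponding read lock, so lookups against every other data file proceed without contention. Whether the threshold should be based on message count or on live bytes is a design choice this sketch does not settle.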
Note that this will still leave us with potentially many individual files, each much smaller than our target size. If that is problematic, it would be possible to combine multiple partial files during the compaction process (while respecting the max file size) instead of writing live messages back into their current file.
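One way to plan such a merge is a simple in-order grouping that starts a new output file whenever the next file's live data would push the total past the maximum. The sketch below is only an illustration: `MergePlanSketch`, `plan`, and its parameters are hypothetical names, and it ignores how deletion records and index entries that reference the old file ids would be updated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical planner for merging compacted files; not KahaDB code.
final class MergePlanSketch {

    /**
     * Groups files in ascending id order so that the live bytes in each group
     * fit within maxFileSize. A single file whose live data already exceeds the
     * limit ends up in a group of its own.
     */
    static List<List<Integer>> plan(SortedMap<Integer, Long> liveBytesByFile, long maxFileSize) {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long used = 0;
        for (Map.Entry<Integer, Long> e : liveBytesByFile.entrySet()) {
            long size = e.getValue();
            // Close the current group if adding this file would overflow it.
            if (!current.isEmpty() && used + size > maxFileSize) {
                groups.add(current);
                current = new ArrayList<>();
                used = 0;
            }
            current.add(e.getKey());
            used += size;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        SortedMap<Integer, Long> liveBytes = new TreeMap<>();
        liveBytes.put(1, 10L);
        liveBytes.put(2, 20L);
        liveBytes.put(3, 15L);
        liveBytes.put(4, 30L);
        System.out.println(plan(liveBytes, 32L)); // prints [[1, 2], [3], [4]]
    }
}
```

Grouping only adjacent files keeps messages in their original journal order, at the cost of packing less tightly than a best-fit strategy would.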