I have discovered an issue with KahaDB index recovery after an unclean shutdown (OOM error, kill -9, etc.) that leads to excessive disk space usage.
Normally, on a clean shutdown, the index writes the known set of free pages to db.free and reads that file back on startup to learn which pages can be reused. On an unclean shutdown, db.free is not written, so on startup the index is supposed to scan the page file to find all of the free pages.
Unfortunately, this scan of the page file runs before the total page count has been set, so the iterator that is created always thinks there are 0 pages to scan.
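To make the ordering problem concrete, here is a minimal sketch in Java. It is not the actual KahaDB code: `PageFileSketch`, `initPageCount`, `scanForFreePages`, and both `recover...` methods are hypothetical names made up for illustration, and the on-disk free markers are modelled as a plain boolean array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a page file. Not the real KahaDB class.
final class PageFileSketch {
    private final boolean[] freeOnDisk; // true where the on-disk page is marked free
    private long pageCount;             // stays 0 until initPageCount() is called

    PageFileSketch(boolean[] freeOnDisk) {
        this.freeOnDisk = freeOnDisk.clone();
    }

    // Stands in for deriving the total page count from the page file.
    void initPageCount() {
        pageCount = freeOnDisk.length;
    }

    // Walks pages [0, pageCount) and collects the ones marked free.
    List<Long> scanForFreePages() {
        List<Long> free = new ArrayList<>();
        for (long id = 0; id < pageCount; id++) {
            if (freeOnDisk[(int) id]) {
                free.add(id);
            }
        }
        return free;
    }

    // Order described in this report: the scan runs while pageCount is still 0.
    static List<Long> recoverScanFirst(boolean[] disk) {
        PageFileSketch pf = new PageFileSketch(disk);
        List<Long> free = pf.scanForFreePages(); // nothing to iterate over yet
        pf.initPageCount();                      // too late to help the scan
        return free;
    }

    // Order the recovery needs: set the page count first, then scan.
    static List<Long> recoverCountFirst(boolean[] disk) {
        PageFileSketch pf = new PageFileSketch(disk);
        pf.initPageCount();
        return pf.scanForFreePages();
    }
}
```

In this model, `recoverScanFirst` always returns an empty list regardless of what is on disk, while `recoverCountFirst` returns every page marked free.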
The end result is that every time an unclean shutdown occurs, all known free pages are lost and are no longer tracked. New pages then have to be allocated instead of reusing the existing free ones, and the space held by the old free pages is never reclaimed, which leads to excessive index file growth over time.
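The growth effect can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not a measurement of KahaDB: `IndexGrowthSketch` and `simulate` are hypothetical names, each cycle allocates a fixed batch of pages, frees all of them, and then "crashes", and the total page count is used as a stand-in for index file size.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model of page allocation across unclean restarts.
final class IndexGrowthSketch {
    // Returns the total number of pages after `cycles` rounds of
    // allocate-batch, free-batch, unclean shutdown.
    static int simulate(int cycles, int batch, boolean freeListRecovered) {
        List<Boolean> freeOnDisk = new ArrayList<>(); // per-page free marker on disk
        Deque<Integer> freeList = new ArrayDeque<>(); // in-memory free page tracking

        for (int c = 0; c < cycles; c++) {
            List<Integer> inUse = new ArrayList<>();
            for (int i = 0; i < batch; i++) {
                Integer page = freeList.poll();
                if (page == null) {
                    // No tracked free page: the file has to grow by one page.
                    page = freeOnDisk.size();
                    freeOnDisk.add(false);
                } else {
                    freeOnDisk.set(page, false);
                }
                inUse.add(page);
            }
            // Free everything that was allocated in this cycle.
            for (int page : inUse) {
                freeOnDisk.set(page, true);
                freeList.add(page);
            }
            // Unclean shutdown: the in-memory free list is gone.
            freeList.clear();
            if (freeListRecovered) {
                // A working recovery scan rebuilds it from the on-disk markers.
                for (int p = 0; p < freeOnDisk.size(); p++) {
                    if (freeOnDisk.get(p)) {
                        freeList.add(p);
                    }
                }
            }
        }
        return freeOnDisk.size();
    }
}
```

With a working recovery scan the page count in this model stays at one batch, because freed pages are reused after every restart. Without it, the file grows by a full batch per unclean shutdown even though every page in it is free.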