LIRS maintains metadata for some tombstone entries that have been evicted from the cache (but might be seen again). To bound the memory used for these entries, the lirs_tombstone_multiple parameter limits the total number of tombstone entries. Enforcement walks the recency list from oldest to newest, removing tombstone entries until the count is back under the limit. This walk must skip over any non-tombstone entries it encounters, and there are pathological cases where it must skip a very large number of them before reaching the first tombstone entry.
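The enforcement walk can be sketched as follows. This is an illustration only, not the actual implementation: the names (enforce_tombstone_limit, recency_list, is_tombstone flags) are all hypothetical, and the real code operates on an intrusive list rather than a Python list.

```python
def enforce_tombstone_limit(recency_list, tombstone_count, tombstone_limit):
    """Walk from oldest to newest, dropping tombstones until under the limit.

    recency_list: list of (key, is_tombstone) pairs ordered oldest -> newest.
    Returns (new_list, entries_walked); entries_walked shows how many list
    entries (tombstone or not) the enforcement had to visit.
    """
    walked = 0
    kept = []
    i = 0
    while tombstone_count > tombstone_limit and i < len(recency_list):
        key, is_tombstone = recency_list[i]
        walked += 1
        if is_tombstone:
            tombstone_count -= 1  # evict this tombstone's metadata
        else:
            kept.append((key, is_tombstone))  # skipped over, but kept
        i += 1
    kept.extend(recency_list[i:])  # everything newer than the stop point survives
    return kept, walked
```

The cost of one enforcement pass is proportional to `walked`, which includes every non-tombstone entry sitting below the oldest tombstone on the recency list.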
Suppose that the cache never accesses the same entry more than once. Starting from empty, the first entries representing 95% of the capacity are automatically considered protected. All subsequent accesses are unprotected. To dislodge a protected entry, an entry needs to be accessed more than once, so if every access is unique, the protected entries are never touched again and form a contiguous block of the oldest entries on the recency list. Tombstone entries sit above them, and unprotected entries are the newest entries on the recency list. It looks like this:
Oldest:   Protected entries (representing 95% of cache capacity)
          Tombstone entries
Newest:   Unprotected entries (representing 5% of cache capacity)
To enforce the tombstone limit, the walk must pass all of the protected entries before reaching even a single tombstone entry. I modified cache-bench to add a case with a UNIFORM distribution but a 500x ratio of entries to the cache size. This shows pathological performance compared to the 3x ratio:
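The layout above can be illustrated with a toy model (not cache-bench or the real LIRS code; the function name and the 95% protected fraction are taken from the description above, and tombstones here are left unbounded to show the raw segment sizes):

```python
def toy_lirs_layout(capacity, num_unique_accesses, protected_frac=0.95):
    """Segment sizes on the recency list under an all-unique access pattern.

    Returns (protected, tombstones, unprotected) counts, ordered
    oldest -> newest on the recency list.
    """
    protected = int(capacity * protected_frac)        # filled first, never touched again
    unprotected_cap = capacity - protected            # the remaining 5% of capacity
    remaining = num_unique_accesses - protected
    unprotected = min(remaining, unprotected_cap)     # newest entries
    tombstones = max(0, remaining - unprotected_cap)  # evicted unprotected entries
    return protected, tombstones, unprotected
```

For a cache of 1,000,000 entries at a 500x ratio, the protected segment is 950,000 entries, so every enforcement pass starting from the oldest end must walk past 950,000 non-tombstone entries before it can remove anything.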
We should rework the enforcement of the tombstone limit to avoid walking past all those entries. One option is to keep the tombstone entries on their own list.
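One possible shape for that fix, sketched under the assumption that tombstones can be tracked in FIFO order separately from the main recency list (the class and method names below are hypothetical):

```python
from collections import deque

class TombstoneList:
    """Tombstones kept on their own list, oldest -> newest.

    Enforcement pops directly from the oldest end, so it never has to
    walk past protected or unprotected entries on the recency list.
    """
    def __init__(self, limit):
        self.limit = limit
        self.tombstones = deque()

    def add(self, key):
        """Record a new tombstone; return any tombstones evicted to stay under the limit."""
        self.tombstones.append(key)
        evicted = []
        while len(self.tombstones) > self.limit:
            evicted.append(self.tombstones.popleft())  # O(1), no walk
        return evicted
```

With this design, enforcement work is proportional to the number of tombstones actually removed, independent of how many protected entries sit at the old end of the recency list.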
Note that without the tombstone limit, this pathological case would use an unbounded amount of memory, because the tombstone entries would never reach the bottom of the recency list and get removed.