With the SQL On-heap Row Cache feature enabled on a persistent, atomic, replicated cache, I found that after a number of queries and updates (typically 40 to 60 updates) the on-heap cache becomes inconsistent with the off-heap storage. This happens on the single, non-clustered Ignite node that I test with.
Specifically, I query the cache using SQL for a specific entry and then update that entry using a normal put() on the cache, but the next SQL query still returns the old value. This causes the business code to not behave as expected.
When examining the state of the cache from DBeaver using a select query, I found that the problem row is duplicated in the query results and appears out of order, despite the results being ordered by key:
Restarting Ignite to clear the on-heap cache reveals the actual row:
When looking at the state of H2RowCache in a heap dump, I found that there were two different instances of GridH2KeyValueRowOnheap containing two different instances of the cache value in different states: the one I'm seeing and the one I'm trying to update it to.
As a side effect of all of this, the ModifyingEntryProcessor always fails on that row because "entryVal" is never equal to "val" when checked in the process() method.
I've attached the file I used to test the issue. The test shows that the problem occurs only when both persistence and the SQL on-heap cache are enabled. If either one is disabled, there is no issue.
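For reference, the combination of settings described above can be sketched in Java as follows. This is not the attached test file, only a minimal configuration sketch. The class name ReproConfig, the cache name "reproCache", and the Integer/String key and value types are placeholders chosen for illustration. It needs ignite-core (with ignite-indexing for SQL) on the classpath.

```java
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Hypothetical helper class; the name is not from the attached test.
public class ReproConfig {
    public static IgniteConfiguration create() {
        // Enable native persistence on the default data region.
        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.setDefaultDataRegionConfiguration(
            new DataRegionConfiguration().setPersistenceEnabled(true));

        // "reproCache" and the Integer/String types are placeholders.
        CacheConfiguration<Integer, String> cache =
            new CacheConfiguration<>("reproCache");
        cache.setCacheMode(CacheMode.REPLICATED);
        cache.setAtomicityMode(CacheAtomicityMode.ATOMIC);
        // The SQL on-heap row cache setting that, together with
        // persistence, triggers the inconsistency in my tests.
        cache.setSqlOnheapCacheEnabled(true);
        // Make the cache queryable through SQL.
        cache.setIndexedTypes(Integer.class, String.class);

        return new IgniteConfiguration()
            .setDataStorageConfiguration(storage)
            .setCacheConfiguration(cache);
    }
}
```

Because persistence is enabled, a node started with this configuration has to be activated (for example with ignite.cluster().active(true)) before the cache can be used. Setting either setPersistenceEnabled or setSqlOnheapCacheEnabled to false gives the two configurations in which I did not see the problem.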