Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.0.0-incubating, 1.1.0
Fix Version/s: None
Description
This can occur for persistent async event queues and for regions when concurrency checks are disabled.
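For context, here is a minimal Java sketch of the kind of configuration that is affected: a persistent async event queue and a persistent region with concurrency checks disabled, both on the same disk store with asynchronous disk writes. The names ("store1", "aeq1", "exampleRegion") and the disk directory are placeholders, not taken from the attached cache.xml; the small max oplog size is only there to make crf rolls (and krf creation) happen sooner.

import java.io.File;
import java.util.List;

import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.cache.asyncqueue.AsyncEvent;
import org.apache.geode.cache.asyncqueue.AsyncEventListener;

public class DrfResurrectionSetup {
  public static void main(String[] args) {
    Cache cache = new CacheFactory().set("mcast-port", "0").create();

    // Persistent disk store backing both the queue and the region.
    // A small max oplog size (1 MB) makes the crf roll sooner.
    cache.createDiskStoreFactory()
        .setDiskDirs(new File[] { new File("/path/to/diskstore") }) // placeholder path
        .setMaxOplogSize(1)
        .create("store1");

    // Persistent async event queue with asynchronous disk writes;
    // the listener body is irrelevant to the bug.
    AsyncEventListener listener = new AsyncEventListener() {
      @Override
      public boolean processEvents(List<AsyncEvent> events) {
        return true;
      }

      @Override
      public void close() {
      }
    };
    cache.createAsyncEventQueueFactory()
        .setPersistent(true)
        .setDiskSynchronous(false)
        .setDiskStoreName("store1")
        .create("aeq1", listener);

    // Persistent region with concurrency checks disabled, also exposed to the bug.
    Region<String, String> region = cache
        .<String, String>createRegionFactory(RegionShortcut.REPLICATE_PERSISTENT)
        .setConcurrencyChecksEnabled(false)
        .setDiskSynchronous(false)
        .setDiskStoreName("store1")
        .addAsyncEventQueueId("aeq1")
        .create("exampleRegion");
  }
}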
Currently:
1.) When rolling a crf, we create a krf that is based on the current “live” region.
2.) If removes are being done at the same time, the krf will reflect the current state, in which the removed keys are not part of the krf file.
3.) Because the drf is written asynchronously, it has not yet recorded the removes.
4.) If the cluster is shut down before the drf is written, this can lead to the following scenarios:
- (No issue) The user recovers with the existing krf/drf/crf files. This works fine because the krf already reflects the removes.
- (Problem!) If the user compacts and then recovers, the removed entries are resurrected and appear in the region because of the way compaction operates: it ignores the krf and works on the existing crf/drf files. Since the drf does not reflect the removes, those events are rolled forward from the crf. (A minimal driver sketch of this sequence follows the list.)
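As referenced above, a minimal driver sketch of that sequence, continuing from the setup sketch (so cache, region, and "store1" are the hypothetical names from that block). Whether it actually hits the window is timing-dependent, since the drf write is asynchronous.

// Put some entries so there is data in the crf.
for (int i = 0; i < 10_000; i++) {
  region.put("key-" + i, "value-" + i);
}

// Remove them: the removes are applied to the live region immediately, but the
// matching drf records are written asynchronously.
for (int i = 0; i < 10_000; i++) {
  region.destroy("key-" + i);
}

// Rolling the crf writes a krf from the current "live" region, so the krf
// omits the removed keys.
cache.findDiskStore("store1").forceRoll();

// Shutting down before the asynchronous drf writes complete can leave the drf
// without the destroy records.
cache.close();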
Attached is a set of oplogs for a single node prior to compaction, along with a cache.xml (the correct location for the diskstore directory needs to be filled in).
Recovering from this set of oplogs recovers 0 entries for the async event queues.
If you run offline compaction on the oplogs and recover, there are now entries in the async event queues.
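For reference, the offline compaction and a before/after check can be run with gfsh's offline disk-store commands; the disk-store name and directory below are placeholders and should match the attached cache.xml. validate offline-disk-store reports per-region entry counts, so comparing its output before and after compaction should show the resurrected entries.

gfsh> validate offline-disk-store --name=store1 --disk-dirs=/path/to/diskstore
gfsh> compact offline-disk-store --name=store1 --disk-dirs=/path/to/diskstore
gfsh> validate offline-disk-store --name=store1 --disk-dirs=/path/to/diskstore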