Resolution: Not A Problem
Affects Version/s: 1.2.0
Fix Version/s: None
Component/s: SCM HA
Release Note: Resolved, since both the retry count and delete transactions are handled correctly. A scan may not see a newly added transaction immediately, but that is not a problem.
After SCM Snapshot is merged, writes to RocksDB are queued in DBTransactionBuffer and committed to RocksDB when SCMDBTransactionBuffer#flush() is called in SCMStateMachine#takeSnapshot().
Assuming newly added, deleted, or updated DeletedBlocksTransaction entries are still queued in DBTransactionBuffer, there are two issues:
- increaseRetryCountOfTransactionInDB() reads a stale retryCount and applies its FAA (fetch-and-add) logic to that stale value.
- SCMBlockDeletingService 1) cannot read newly added txns, and 2) may read txns that have already been deleted.
We will enable the PartialTableCache of DeletedBlocksTXTable. Every add/delete/update will update both the PartialTableCache and the DBTransactionBuffer; a read will check the cache first, then fall back to the table on a cache miss. The cache will be evicted when SCMDBTransactionBuffer#flush() is called.
The cache and table are modified by StateMachineUpdater, which applies committed operations in sequential order, so writes are never concurrent and correctness follows.
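The write and read paths described above can be sketched roughly as follows. This is a minimal illustration, not Ozone code: the class CachedTxTable, its methods, and the use of plain maps for the table, buffer, and cache are all hypothetical stand-ins, and a transaction is reduced to its retry count.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical model of a table with a partial cache; not Ozone's API. */
class CachedTxTable {
  // Stands in for the committed RocksDB table: txId -> retryCount.
  private final Map<Long, Integer> table = new HashMap<>();
  // Stands in for DBTransactionBuffer: pending writes, Optional.empty() = delete.
  private final Map<Long, Optional<Integer>> buffer = new LinkedHashMap<>();
  // Stands in for PartialTableCache: same pending state, consulted by reads.
  private final Map<Long, Optional<Integer>> cache = new HashMap<>();

  /** Add or update: written to both the cache and the buffer. */
  void put(long txId, int retryCount) {
    cache.put(txId, Optional.of(retryCount));
    buffer.put(txId, Optional.of(retryCount));
  }

  /** Delete: recorded as a tombstone in both the cache and the buffer. */
  void delete(long txId) {
    cache.put(txId, Optional.empty());
    buffer.put(txId, Optional.empty());
  }

  /** Read: cache first, then the table on a cache miss. Returns null if absent. */
  Integer get(long txId) {
    Optional<Integer> cached = cache.get(txId);
    if (cached != null) {
      return cached.orElse(null); // a tombstone hides the table entry
    }
    return table.get(txId);
  }

  /** Fetch-and-add on the latest visible value; returns -1 if the txn is absent. */
  int increaseRetryCount(long txId) {
    Integer current = get(txId);
    if (current == null) {
      return -1;
    }
    put(txId, current + 1);
    return current + 1;
  }

  /** Commit the pending writes to the table, then evict the cache. */
  void flush() {
    for (Map.Entry<Long, Optional<Integer>> e : buffer.entrySet()) {
      if (e.getValue().isPresent()) {
        table.put(e.getKey(), e.getValue().get());
      } else {
        table.remove(e.getKey());
      }
    }
    buffer.clear();
    cache.clear();
  }
}
```

In this model, two consecutive increaseRetryCount() calls before a flush see each other's result because the second read hits the cache, which is the stale-read case the buffer alone would get wrong.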
For the issues above:
- increaseRetryCountOfTransactionInDB() is executed in StateMachineUpdater, so it always reads the latest retryCount.
- SCMBlockDeletingService scans the table via an iterator, so the iterator must first return the txns in the cache, then the txns that are in the table but not in the cache. Note that StateMachineUpdater might update the cache and table while SCMBlockDeletingService is scanning them. This is not a problem: the scan is a periodic task, so SCMBlockDeletingService can safely ignore updates made during a scan.
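The scan order described in the last item could look roughly like the sketch below. It is an illustration under assumptions, not the actual iterator: MergedScan and scan() are hypothetical names, deletes are modeled as Optional.empty() tombstones in the cache, and the scan copies the cache up front as one simple way to ignore concurrent updates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical merged scan over a cache and a table; not Ozone's API. */
class MergedScan {
  /**
   * Returns txIds visible to a scan: live cache entries first, then table
   * entries that have no cache entry at all. A cache tombstone
   * (Optional.empty()) hides the matching table entry.
   */
  static List<Long> scan(Map<Long, Optional<String>> cache,
                         Map<Long, String> table) {
    // Copy the cache so updates made during the scan are ignored
    // (assumption: a point-in-time view is good enough for a periodic task).
    Map<Long, Optional<String>> cacheSnapshot = new TreeMap<>(cache);
    List<Long> result = new ArrayList<>();

    // 1) Txns in the cache, skipping those marked as deleted.
    for (Map.Entry<Long, Optional<String>> e : cacheSnapshot.entrySet()) {
      if (e.getValue().isPresent()) {
        result.add(e.getKey());
      }
    }
    // 2) Txns in the table that are not in the cache.
    for (Long txId : new TreeMap<>(table).keySet()) {
      if (!cacheSnapshot.containsKey(txId)) {
        result.add(txId);
      }
    }
    return result;
  }
}
```

With a cache holding a new txn 3 and a tombstone for txn 2, and a table holding txns 1 and 2, the scan yields 3 and then 1: the newly added txn is visible and the deleted one is not.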