Description
Analysis and reproduction of IGNITE-21142 showed that a metastorage raft command can be re-sent if it does not time out, which can lead to hidden negative consequences such as those seen in IGNITE-21142.
We need to find out the reasons for this decision (retry by timeout) and decide what to do next. I think we should use an infinite timeout.
Upd#1
As discussed, we need to detect whether an InvokeCommand has already been processed on the server and, if so, resend the original response instead of reprocessing the command. This applies not only to invoke but also to multiInvoke. Note, though, that it concerns only MS and maybe CMG, not partitions: within partitions, the tx protocol, together with returning the result from indexes rather than from raft, protects us from repeated processing of non-idempotent commands.
All in all, the following solution is expected to be implemented:
- A new interface, NonIdempotentCommand, is introduced with an id field.
- All non-idempotent MS commands, such as InvokeCommand and MultiInvokeCommand, implement this interface.
- On the client side, an identifier is added to the command. Two options are possible here:
  - Set the id on the command when it is created. This is the easiest way, but it will require extra effort on the server side to track command time. In that case it's possible to use LongCounter + nodeId as an id.
  - Or assign the id within the retry loop. In that case the id can double as a "command time", which also means that a clock or System.currentTime<> should be used as the id. I strongly believe that the first option is better for now.
- On the server side, specifically within the MS state machine, a new nonIdempotentCommandCache is introduced: commandId -> (commandResult, commandStartTime).
- For each NonIdempotentCommand the following logic should be implemented:
  - First, check whether the cache contains a command with the given id. If it does, return the cached result without reprocessing the command.
  - If the command is not in the cache, process it and populate the cache with the result.
Basically, that's all. Cache persistence, recovery on group restart, and cache cleanup will be covered in separate tickets.
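To make the proposal concrete, here is a minimal sketch of the first option (id assigned at command creation) together with the server-side cache check. It is an illustration only, not the actual Ignite code: CommandId, CommandIdGenerator, CachedResult, StateMachine and the simplified InvokeCommand with a put-if-absent meaning are all hypothetical stand-ins, and an AtomicLong plays the role of the LongCounter. Cache persistence, recovery and cleanup are intentionally left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

final class IdempotentInvokeSketch {
    /** Hypothetical id: issuing node plus a per-node counter (LongCounter + nodeId). */
    record CommandId(String nodeId, long counter) {}

    /** Marker for commands that must not be applied twice. */
    interface NonIdempotentCommand {
        CommandId id();
    }

    /** Simplified stand-in for InvokeCommand: "put value if key is absent". */
    record InvokeCommand(CommandId id, String key, String value) implements NonIdempotentCommand {}

    /** Client side: the id is assigned once at creation, so every retry reuses it. */
    static final class CommandIdGenerator {
        private final String nodeId;
        private final AtomicLong counter = new AtomicLong();

        CommandIdGenerator(String nodeId) {
            this.nodeId = nodeId;
        }

        CommandId next() {
            return new CommandId(nodeId, counter.incrementAndGet());
        }
    }

    /** Cached outcome plus the time the command was first processed (kept for later cleanup). */
    record CachedResult(Object result, long commandStartTime) {}

    /** Toy state machine. Commands are assumed to be applied sequentially, so plain HashMaps suffice. */
    static final class StateMachine {
        private final Map<CommandId, CachedResult> nonIdempotentCommandCache = new HashMap<>();
        private final Map<String, String> storage = new HashMap<>();
        int applied; // how many commands were actually processed (not served from the cache)

        Object onCommand(InvokeCommand cmd) {
            // Step 1: a known id means this is a retry, so return the original response.
            CachedResult cached = nonIdempotentCommandCache.get(cmd.id());
            if (cached != null) {
                return cached.result();
            }

            // Step 2: first delivery, so process the command and remember the result.
            boolean result = storage.putIfAbsent(cmd.key(), cmd.value()) == null;
            applied++;
            nonIdempotentCommandCache.put(cmd.id(), new CachedResult(result, System.currentTimeMillis()));
            return result;
        }
    }

    public static void main(String[] args) {
        CommandIdGenerator ids = new CommandIdGenerator("node-1");
        StateMachine sm = new StateMachine();

        InvokeCommand cmd = new InvokeCommand(ids.next(), "table.N", "descriptor");
        Object first = sm.onCommand(cmd);
        Object retry = sm.onCommand(cmd); // same id, e.g. re-sent by the retry loop

        System.out.println("first=" + first + ", retry=" + retry + ", applied=" + sm.applied);
    }
}
```

In this sketch the retried command gets the same result as the first delivery and is applied only once. Without the cache, the retry would see the key already present and report a failure, which is roughly the kind of symptom reported in IGNITE-21142.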
Issue Links
- blocks: IGNITE-22214 Meta storage idempotent invokes: persist and recovery idempotent command cache (Resolved)
- is blocked by: IGNITE-22113 Remove unused MetaStorageManagerImpl getAnd<> methods (Resolved)
- is caused by: IGNITE-21142 "SqlException: Table with name 'N' already exists" upon creating a non-existing table (Resolved)
- links to