I think if we use WALEdit here, we should store the nonce and also the incremented value.
If the RS goes down after adding this WALEdit, we should be able to just replay the edit and use the value from this Edit. And i think before the client retries using the same nonce we should ensure if that RS to which the previous nonce was issued was down.
If this is the case the client's retry can be ignored.
Under current (and good) logic if edit went to the store, we will not see the WAL edit (again especially when the log replay is remote w/o recovered edits).
For standard nonces, it would mean that nonce would not be there on new server unless we widen WAL replay significantly and recover nonces for entries that are already in store.
For incremental nonces, we could do the same, or we could play it safe and assume everything before the lowest nonce we see at replay is already in store, but I wonder how well this will actually work.
It may happen that because successful operations take less than failed one, any client with more than one thread (or async APIs that may exist later) would be almost guaranteed to fail all such retries (because non-failing threads may have successes after the failing thread,
and any success with higher nonce will invalidate the lower nonce during recovery). Perhaps that approach is feasible still, at least it won't produce dups.
Why not just a UUID rather than all these gyrations? Or do you want to make it so that looking at id, you can tell what client it came from? It looks like you throw away all this info when you create the SecureRandom? Creating a SecureRandom for this one time use is expensive.
Interesting question... Java UUIDs are just random numbers. To keep reasonable size we could just generate secure random long then; do you think that would be sufficient?
I wanted the result to have deterministic and random part, so that for random number collision the additional condition would be either unrelated hash collision, or generating the same secure number at exact same IP, PID, TID and time
Client id should be long since in proto is uint64 in proto?
It is... the generated part is actually base for client id, renamed it.
Does ClientNonceManager have to be in top-level? Can it not be in client package and be made package private?
It has a constant that is used all over the place. Let me see if I can put it elsewhere...
Does it make sense putting clientid together w/ nonce making? Could you have a class that does noncemaking and then another to hold the clientid?
Yeah, that could be done... nonce-making class then would be just wrapper around atomiclong though
Is clientid tied to Connection?
It is static on connectionmanager. The only reason to not have less clientIds than any given number is if it makes it hard to control the incrementing number, so there's one per process.
Can you get connectionid? Or make a connectionid? Connections are keybqed by Configuration already? Would the Connection key do as a clientid?
I don't think so, they can be repeated (in fact probably will be) between restarts.
So, you decided to not pass nonce in here:
+ r = region.append(append, append.getWriteToWAL()/, clientId2, nonce/);
This is temporary, will be uncommented to write it into WAL.