Details
Bug
Status: Open
Critical
Resolution: Unresolved
4.5.2
Security Level: Public (Anyone can view this level - this is the default.)
None
Three management servers and hundreds of VMware compute nodes
Description
[appearance]
When many concurrent disable static NAT commands are issued in one network, some public IPs may be left behind on the VR, which can cause serious problems for the whole cloud network.
[reason]
When the disable static NAT command is executed, CS also executes a disassociate IP command at the same time. That command sends all the public IPs of the network, both those to associate and those to disassociate, to the VR. Suppose two disable static NAT commands are issued concurrently for public IPs A and B: first disable IP A, then disable IP B. Normally the two commands would be A- B+ followed by A- B- (here - means disassociate IP and + means associate IP), and this works as expected.
However, if the two commands are issued very close together, the VR's answer for disassociating IP A has not yet been returned to CS, so public IP A is still in the associated state in the CS database. The second command is then not A- B- but A+ B-, so IP A is unexpectedly re-associated by the second command and is left behind on the VR. In this way, IPs that should have been disassociated may be re-associated by other concurrent commands. The issue is easy to trigger when disable static NAT commands are issued in close succession.
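The sequence above can be reproduced with a small simulation. This is only a sketch of the described behaviour, not CloudStack code: the function names, the dictionary standing in for the CS database, and the set standing in for the VR are all hypothetical.

```python
def build_ip_command(db_state, disabling_ip):
    """Build the full IP list sent to the VR for one disable request.

    db_state maps each public IP to True (associated) or False
    (disassociated), as recorded in the (hypothetical) CS database.
    The IP being disabled is always '-'; every other IP is taken from
    the database state.
    """
    command = {}
    for ip, associated in db_state.items():
        if ip == disabling_ip:
            command[ip] = "-"
        else:
            command[ip] = "+" if associated else "-"
    return command


def apply_to_vr(vr_ips, command):
    """Apply a command to the set of IPs currently configured on the VR."""
    for ip, op in command.items():
        if op == "+":
            vr_ips.add(ip)
        else:
            vr_ips.discard(ip)


def simulate(answer_arrives_before_second_command):
    db = {"A": True, "B": True}  # both IPs start out associated
    vr = {"A", "B"}

    first = build_ip_command(db, "A")
    apply_to_vr(vr, first)
    if answer_arrives_before_second_command:
        db["A"] = False  # the VR answer has updated the database in time

    # In the racing case the database still says A is associated.
    second = build_ip_command(db, "B")
    apply_to_vr(vr, second)
    return first, second, vr


normal = simulate(answer_arrives_before_second_command=True)
racing = simulate(answer_arrives_before_second_command=False)
```

In the normal run the second command is A- B- and the VR ends up with no public IPs. In the racing run the second command is A+ B-, and IP A remains on the VR.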
[bug fix suggestion]
Use some kind of locking mechanism such as an optimistic lock. It would assign a version ID to each network and VPC in the CS database. Whenever an operation on public IPs or network rules is performed for the network or VPC (network rules have the same problem), the version ID is incremented. When the resource layer (such as VmwareResource) finds that a command it receives has a version earlier than or equal to the last version it has already seen, the command is discarded. This guarantees that every command sent by the resource layer and received by the VR reflects the latest version of the network or VPC at that time, so the scenario above cannot happen again.
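A minimal sketch of the discard rule, assuming a per-network version number travels with each command. The class and method names are hypothetical and do not correspond to existing CloudStack classes. The sketch covers only the version comparison on the resource side, not how each command's IP list is built.

```python
import threading


class NetworkVersionCounter:
    """Stands in for the version ID column kept per network/VPC in the CS database."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0

    def next_version(self):
        # Incremented for every public IP or network rule operation.
        with self._lock:
            self._version += 1
            return self._version


class ResourceVersionGuard:
    """Hypothetical check on the resource side (e.g. in front of the VR)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_seen = {}  # network id -> last applied version

    def should_apply(self, network_id, version):
        # Discard any command whose version is earlier than or equal to
        # the last version already seen for this network.
        with self._lock:
            last = self._last_seen.get(network_id, 0)
            if version <= last:
                return False
            self._last_seen[network_id] = version
            return True


counter = NetworkVersionCounter()
guard = ResourceVersionGuard()

v1 = counter.next_version()  # command for disabling IP A
v2 = counter.next_version()  # command for disabling IP B

applied_v2 = guard.should_apply("net-1", v2)  # newer command arrives first
applied_v1 = guard.should_apply("net-1", v1)  # older command is discarded
```

Here the command carrying `v2` is applied, and the late command carrying `v1` is rejected because its version is not newer than the last one seen for `net-1`.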