Description
(copying this from my multi-master design doc)
The inclusion or exclusion of a tablet in an incremental tablet report is edge-triggered, and may result in a state-changing operation on the tserver, communicated via an out-of-band RPC. This RPC is retried until it succeeds. However, if the leader master dies after it has responded to the tserver's heartbeat but before it sends the out-of-band RPC, the edge-triggered tablet report is effectively lost: the tserver considers the report delivered, and the state-changing operation will not be performed until the next time the tablet is included in a tablet report. Because the inclusion criteria for incremental tablet reports are narrow, operations may be "missed" for quite some time.
These operations include:
- Some tablet deletions, such as tablets belonging to orphaned tables, or tablets whose deletion RPCs were sent and failed during an earlier DeleteTable() request.
- Some tablet alters, such as tablets whose alter RPCs were sent and failed during an earlier AlterTable() request.
- Config changes sent due to under-replicated tablets.
A simple fix is to require that tservers send a full tablet report when they detect that a new leader master was elected.
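To illustrate the proposed fix, here is a minimal sketch of the tserver-side decision, not actual code from the tree. The `Heartbeater` class, the `leader_id` argument (some identifier for the current leader master, e.g. its UUID and term, however the tserver learns it), and the dict-shaped report are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Heartbeater:
    """Hypothetical tserver-side heartbeat state (illustration only)."""
    tablets: set = field(default_factory=set)   # every tablet hosted here
    dirty: set = field(default_factory=set)     # tablets changed since last report
    last_leader_id: Optional[str] = None        # leader that got our last report

    def mark_dirty(self, tablet_id: str) -> None:
        self.tablets.add(tablet_id)
        self.dirty.add(tablet_id)

    def build_report(self, leader_id: str) -> dict:
        # A different leader may never have acted on reports acknowledged by
        # the old one, so fall back to a full report when the leader changes.
        full = leader_id != self.last_leader_id
        ids = set(self.tablets) if full else set(self.dirty)
        self.last_leader_id = leader_id
        # Simplification: assume the heartbeat is acknowledged immediately.
        self.dirty.clear()
        return {"is_incremental": not full, "tablet_ids": sorted(ids)}


hb = Heartbeater()
hb.mark_dirty("t1")
hb.mark_dirty("t2")
first = hb.build_report("master-a")       # first contact: full report
quiet = hb.build_report("master-a")       # same leader, nothing changed
failover = hb.build_report("master-b")    # new leader: full report again
```

In this sketch, `quiet` is an empty incremental report, while `failover` lists both tablets again even though neither is dirty. That gives the new leader a chance to redo any deletion, alter, or config change that the old leader acknowledged but never sent.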