Description
There are two kinds of tablet replica deletion: tombstone and delete. A tombstoned tablet replica might never be fully deleted, because a delete-type deletion only occurs when the tablet itself is deleted, and the delete requests are sent only to the voters, which do not include the tombstoned replicas.
Here is an example:
Tablet T:
replica A
replica B
replica C
After rebalance:
replica A
replica B
replica C (Tombstone)
replica D
When tablet T is deleted, replicas A, B, and D are deleted, but C remains forever.
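The scenario above can be reproduced with a toy model. This is only an illustrative sketch, not Kudu code: the `delete_tablet` function and the `"VOTER"` / `"TOMBSTONED"` state strings are hypothetical names, and the only behavior modeled is that delete requests reach voters and nothing else.

```python
def delete_tablet(replicas):
    """Toy model of tablet deletion: only voter replicas get a delete request.

    `replicas` maps a tserver name to a replica state string
    (hypothetical values: "VOTER" or "TOMBSTONED").
    Returns the replicas that still exist afterwards.
    """
    remaining = {}
    for tserver, state in replicas.items():
        if state == "VOTER":
            # The voter receives the delete request and removes its replica.
            continue
        # A tombstoned replica is not in the config, so it is never notified.
        remaining[tserver] = state
    return remaining


# Tablet T after the rebalance described above.
tablet_t = {"A": "VOTER", "B": "VOTER", "C": "TOMBSTONED", "D": "VOTER"}
leftover = delete_tablet(tablet_t)
print(leftover)  # {'C': 'TOMBSTONED'}
```

The tombstone on C survives because nothing in the deletion path knows that C ever hosted the tablet.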
As the picture shows, the tablet had already been deleted at 3:00 am on 13th Jun, but the tombstone replica still exists.
The data of a tombstone replica is deleted, but its metadata is still kept in memory. The largest part of that metadata, the SchemaPB, occupies a lot of memory.
In some of our clusters, the number of tombstone replicas on each tserver can reach 50k ~ 100k, which takes about 10 GB of memory.
Adding a vector to each tablet to record the tablet servers that used to hold a replica of it would take too many resources. So I think a periodic heartbeat might be a good way to solve the problem.
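One possible reading of the heartbeat idea is sketched below. This is an assumption about how it could work, not an existing Kudu API: the tserver periodically reports the tablet IDs of its tombstones, the master answers with the ones whose tablet no longer exists, and the tserver drops those. The names `handle_heartbeat`, `TabletServer`, and `heartbeat` are all hypothetical.

```python
def handle_heartbeat(live_tablets, reported_tombstones):
    """Hypothetical master side: pick the reported tombstones whose tablet is gone."""
    return sorted(t for t in reported_tombstones if t not in live_tablets)


class TabletServer:
    """Hypothetical tserver that tracks only the IDs of its tombstoned replicas."""

    def __init__(self, tombstones):
        self.tombstones = set(tombstones)

    def heartbeat(self, live_tablets):
        # Report tombstones, then fully delete the ones the master no longer knows.
        to_delete = handle_heartbeat(live_tablets, self.tombstones)
        for tablet_id in to_delete:
            self.tombstones.discard(tablet_id)
        return to_delete


# Tablet "T" has been deleted from the catalog; "T2" still exists elsewhere.
master_live_tablets = {"T2"}
ts = TabletServer({"T", "T2"})
deleted = ts.heartbeat(master_live_tablets)
print(deleted)        # ['T']
print(ts.tombstones)  # {'T2'}
```

With this shape the master keeps no per-tablet history of past hosts; the tserver supplies that information itself on each heartbeat, and tombstones of tablets that still exist are left alone.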