Description
Last weekend many TS hit a lot of "too many open files" errors (we have not upgraded to  yet). When I used our internal deploy tool to restart the cluster (stop all TS, then start all TS), the control machine had an issue that seemed to block writes to the SSH terminal (maybe a USB driver issue, not related to this bug), so only half of the TS (about 30) were shut down. After maybe 10 minutes, I switched to another control host and performed the whole restart.
Then I saw that writes were blocked because 1 tablet had no leader. According to the web UI, 2 of its 3 replicas were in the follower state and 1 was TABLET_DATA_TOMBSTONED, but every election failed. I will attach the logs of the 2 followers.