Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Fix Version/s: 1.7.0
- Component/s: None
Description
The RaftConsensusNonVoterITest::RestartClusterWithNonVoter scenario of raft_consensus_nonvoter-itest creates a situation where one tablet server is shut down and the catalog manager endlessly retries deleting a tablet from that server, outputting a huge number of messages like the following:
W0218 11:09:50.375723 26207 catalog_manager.cc:2648] Async tablet task 83456f56fde245c29db20ffa967275f1 Delete Tablet RPC for TS=293703ef5e1840e39f9b9a260ddae65b failed: Not found: failed to reset TS proxy: Could not find TS for UUID 293703ef5e1840e39f9b9a260ddae65b
I0218 11:09:50.404064 26207 catalog_manager.cc:2629] Scheduling retry of 83456f56fde245c29db20ffa967275f1 Delete Tablet RPC for TS=293703ef5e1840e39f9b9a260ddae65b with a delay of 45 ms (attempt = 0)
W0218 11:09:50.404109 26207 catalog_manager.cc:2648] Async tablet task 83456f56fde245c29db20ffa967275f1 Delete Tablet RPC for TS=293703ef5e1840e39f9b9a260ddae65b failed: Not found: failed to reset TS proxy: Could not find TS for UUID 293703ef5e1840e39f9b9a260ddae65b
I0218 11:09:50.449461 26207 catalog_manager.cc:2629] Scheduling retry of 83456f56fde245c29db20ffa967275f1 Delete Tablet RPC for TS=293703ef5e1840e39f9b9a260ddae65b with a delay of 13 ms (attempt = 0)
W0218 11:09:50.449510 26207 catalog_manager.cc:2648] Async tablet task 83456f56fde245c29db20ffa967275f1 Delete Tablet RPC for TS=293703ef5e1840e39f9b9a260ddae65b failed: Not found: failed to reset TS proxy: Could not find TS for UUID 293703ef5e1840e39f9b9a260ddae65b
It seems the issue is a wrong attempt count: for some reason the attempt counter stays at 0, so the exponential back-off strategy is never applied.
The log from the test scenario run under dist_test is attached.
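For illustration, here is a minimal C++ sketch (not the actual catalog_manager.cc code; RetryState and ComputeBackoffMs are made-up names) of how an attempt-based exponential back-off is usually derived, and why a counter stuck at 0 keeps the delays in the small base range, as in the 45 ms / 13 ms retries above:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <random>

// Hypothetical illustration only; not the Kudu catalog manager implementation.
struct RetryState {
  int attempt = 0;  // must be advanced on every retry for back-off to grow
};

// Randomized delay that doubles with each attempt, capped at 60 s.
int64_t ComputeBackoffMs(const RetryState& state, std::mt19937* rng) {
  const int64_t base_ms = 50;
  const int64_t cap_ms = 60 * 1000;
  // 2^attempt growth; with 'attempt' stuck at 0 the delay never leaves the base range.
  const int64_t max_delay_ms =
      std::min(cap_ms, base_ms << std::min(state.attempt, 20));
  std::uniform_int_distribution<int64_t> dist(1, max_delay_ms);
  return dist(*rng);
}

int main() {
  std::mt19937 rng(12345);
  RetryState state;
  for (int i = 0; i < 5; ++i) {
    std::cout << "attempt=" << state.attempt
              << " delay_ms=" << ComputeBackoffMs(state, &rng) << std::endl;
    state.attempt++;  // the step the buggy behavior is missing: the counter never advances
  }
  return 0;
}

Presumably the fix is to make sure the attempt counter is actually incremented each time a retry of the Delete Tablet RPC is scheduled, so the back-off kicks in instead of retrying every few milliseconds.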
Attachments
Issue Links
- relates to: KUDU-2323 NON_VOTER replica flapping (repeatedly added and evicted) (Open)