ts_recovery-itest has been quite flaky lately (failing ~50% of the time in TSAN builds). I was able to reproduce the flakiness reliably by doing:
I tracked the flakiness down to the commit for KUDU-1034 (4263b037844fca595a35f99479fbb5765ba7a443). The issue seems to be that the test sets a low timeout, so a large number of requests time out, and with the new behavior introduced by that commit we end up hammering the master and are unable to make progress.
It is unclear whether this is a feature (in which case we need to update the test) or a bug.