During a session, if the statestore goes down, the impalads continue executing as long as they have already received enough metadata from the statestore prior to its failure.
Until the statestore returns, they run on the stale metadata they possess. However, when the statestore comes back online, the first membership callback it makes to the impalad hosts erases the "known_backends" list that each impalad has stored locally.
Therefore, in-flight queries fail (sometimes without propagating the error to the shell ->
Do not erase the list of "known_backends" in each impalad until the statestore has a new list to provide to the impalads.
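The proposed behavior can be illustrated with a minimal sketch. This is not Impala's actual code: the class MembershipCache, the method on_membership_update, the statestore_restarted flag, and the host names are all hypothetical, and the sketch assumes the first callback after a restart arrives with an empty backend list.

```python
class MembershipCache:
    """Hypothetical stand-in for an impalad's local view of cluster membership."""

    def __init__(self):
        # Maps a backend address to whatever descriptor the impalad keeps for it.
        self.known_backends = {}

    def on_membership_update(self, backends, statestore_restarted=False):
        """Apply a membership callback from the statestore.

        Assumption: right after a restart the statestore may call back
        before it has rebuilt its own membership list, so `backends` is empty.
        """
        if statestore_restarted and not backends:
            # Keep the stale list so in-flight queries can still reach
            # their backends; wait for a populated update instead.
            return False
        # The statestore has a list to provide: replace the local copy in one step.
        self.known_backends = dict(backends)
        return True


cache = MembershipCache()
cache.on_membership_update({"host1": "backend-1", "host2": "backend-2"})

# First callback after the statestore restarts carries no backends yet.
cache.on_membership_update({}, statestore_restarted=True)
print(sorted(cache.known_backends))

# Once the statestore has a new list, the local copy is replaced.
cache.on_membership_update({"host1": "backend-1"}, statestore_restarted=True)
print(sorted(cache.known_backends))
```

The point of the sketch is that the old list is only dropped at the moment a replacement is available, so there is no window in which an impalad knows of no backends.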
This bug was found during initial runs of ChaosMonkey on Impala.