Well, perhaps I phrased things poorly, so let's start over and not even consider the test.
bq: A race condition should not put a core in an unrecoverable error situation
Totally agree. If I understand the patch, though, the reload operation and the delete operation are competing. In this particular case the reload was caused by changing the configs, but that's largely immaterial. The delete got in before the reload operation and closed the core. With this new exception being thrown, the reload operation still fails in the sense that the core is still unavailable, right? I don't quite see how throwing a new exception, catching it, and not adding it to the failures list changes the fact that the core failed to reload; it's still unavailable. How does it ever recover?
Or are you saying that in this case, since the core is deleted, it really doesn't need to recover? That still doesn't seem to cover the general case of the core being closed during a reload operation. There's a comment somewhere in the code suggesting that perhaps the reload should be retried. A retry will still fail in this case, but are there other cases where reloading will succeed and thus we should retry?
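To make the question concrete, here is a rough sketch of the distinction I'm asking about. It is not the patch and not Solr's actual API: Core, FakeCore, CoreClosedDuringReloadException, Outcome and reloadWithRetry are all hypothetical names invented for illustration. The idea is that a "closed during reload" failure is only safe to skip when the core was actually deleted; in any other case the reload is retried, and it is reported as a failure if the retries run out.

```java
public class ReloadRetrySketch {

    /** Hypothetical exception: the core was closed while a reload was in flight. */
    static class CoreClosedDuringReloadException extends RuntimeException {
        CoreClosedDuringReloadException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical result of a reload attempt sequence. */
    enum Outcome { RELOADED, SKIPPED_DELETED, FAILED }

    /** Hypothetical, minimal view of a core; not Solr's SolrCore. */
    interface Core {
        boolean isDeleted();
        void reload(); // may throw CoreClosedDuringReloadException
    }

    /** Test double: fails with "closed" a fixed number of times, or always if deleted. */
    static class FakeCore implements Core {
        private final boolean deleted;
        private int closedFailuresLeft;

        FakeCore(boolean deleted, int closedFailuresLeft) {
            this.deleted = deleted;
            this.closedFailuresLeft = closedFailuresLeft;
        }

        @Override
        public boolean isDeleted() {
            return deleted;
        }

        @Override
        public void reload() {
            if (deleted) {
                throw new CoreClosedDuringReloadException("core was deleted");
            }
            if (closedFailuresLeft > 0) {
                closedFailuresLeft--;
                throw new CoreClosedDuringReloadException("core closed during reload");
            }
        }
    }

    /**
     * Tries to reload up to maxAttempts times. A deleted core is skipped
     * (nothing to recover); any other "closed" failure is retried and
     * reported as FAILED if every attempt fails.
     */
    static Outcome reloadWithRetry(Core core, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                core.reload();
                return Outcome.RELOADED;
            } catch (CoreClosedDuringReloadException e) {
                if (core.isDeleted()) {
                    return Outcome.SKIPPED_DELETED; // delete won the race
                }
                // Not deleted: possibly transient, so fall through and retry.
            }
        }
        return Outcome.FAILED; // should still land in the failures list
    }

    public static void main(String[] args) {
        System.out.println(reloadWithRetry(new FakeCore(true, 0), 3));
        System.out.println(reloadWithRetry(new FakeCore(false, 1), 3));
        System.out.println(reloadWithRetry(new FakeCore(false, 5), 3));
    }
}
```

If the patch's new exception covers both the deleted and the not-deleted situations, then silently dropping it would hide the FAILED outcome above, which is the case I'm worried about.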
All that said, it'll be at least tomorrow night before I can beast this patch....