This test has been failing quite often lately. I poked around a bit and found what I think is evidence of a race condition in CoreContainer.reload, where a reload of the same core is triggered from two places in close succession. I'll attach a preliminary patch soon.
Without this patch I had 25 failures out of 1,000 runs; with it, 0.
I consider this patch a WIP and am putting it up for comment. Well, it has nocommits, so... In particular, I have to review some changes I made to which name we're using for PendingCoreOps. I also want to back out my changes and beast it again with some more logging, to confirm that multiple reloads really are happening before declaring victory.
What this does is put the name of the core we're reloading into pendingCoreOps earlier in the reload process, so the second call to reload waits until the first has completed. I also restructured the method a bit, because I don't like if clauses that go on forever followed by a small else clause way down the code. I inverted the test and bail out of the method early rather than falling off the end after the else clause.
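To make the idea concrete, here is a minimal Java sketch of "register the core name first, make the second caller wait". It is not the patch itself: the class and method names (PendingOpsSketch, waitThenRegister, release, ReloadSketch), the core name "collection1", and the sleep that stands in for the real reload work are all hypothetical. It also shows the inverted test with an early bail-out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the pending-operations bookkeeping (not Solr's class).
final class PendingOpsSketch {
    private final Set<String> pendingCoreOps = new HashSet<>();

    // Blocks while another operation is registered for this core, then registers ours.
    synchronized void waitThenRegister(String coreName) throws InterruptedException {
        while (pendingCoreOps.contains(coreName)) {
            wait();
        }
        pendingCoreOps.add(coreName);
    }

    // Unregisters the core and wakes up any callers waiting on it.
    synchronized void release(String coreName) {
        pendingCoreOps.remove(coreName);
        notifyAll();
    }
}

final class ReloadSketch {
    private final PendingOpsSketch ops = new PendingOpsSketch();
    private final List<String> events = Collections.synchronizedList(new ArrayList<>());

    void reload(String coreName, boolean coreExists) throws InterruptedException {
        // Inverted test: bail out early instead of a long if with a distant else.
        if (!coreExists) {
            throw new IllegalArgumentException("No such core: " + coreName);
        }
        // Register before doing any reload work, so a second reload waits here.
        ops.waitThenRegister(coreName);
        try {
            events.add("start");
            Thread.sleep(20); // placeholder for the real reload work
            events.add("end");
        } finally {
            ops.release(coreName);
        }
    }

    // Runs two reloads of the same core concurrently and returns the recorded events.
    static List<String> runTwoConcurrentReloads() {
        ReloadSketch container = new ReloadSketch();
        Runnable task = () -> {
            try {
                container.reload("collection1", true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(container.events);
    }
}
```

Because the name is registered before any work starts, the two reloads cannot interleave: the recorded events always come out as start, end, start, end, whichever thread gets in first.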
One thing I don't like about this is that two reloads in such rapid succession seem wasteful. Even so, I can imagine one reload getting far enough to load the schema, and then a schema update changing the schema and calling reload. In that case the first reload finishes with the old schema, so I don't think just returning when a reload is already happening on that core is valid.
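The scenario above can be written out as a deterministic timeline. This is only an illustration under assumed names (SkipVersusWaitSketch, finalSchemaVersion, integer schema versions); it involves no real Solr code or threads.

```java
// Hypothetical timeline of the scenario: reload #1 reads the schema, a schema
// update lands, and the update then asks for reload #2.
final class SkipVersusWaitSketch {
    // Returns the schema version the core ends up running with.
    static int finalSchemaVersion(boolean skipIfReloadInProgress) {
        int schemaOnDisk = 1;

        // Reload #1 starts and has already read the schema.
        int schemaSeenByReload1 = schemaOnDisk;
        boolean reloadInProgress = true;

        // A schema update changes the schema while reload #1 is still running...
        schemaOnDisk = 2;
        // ...and then requests reload #2, which is either dropped or queued.
        boolean secondReloadRuns = !(skipIfReloadInProgress && reloadInProgress);

        // Reload #1 finishes with whatever schema it read earlier.
        int coreSchema = schemaSeenByReload1;
        reloadInProgress = false;

        // A queued reload #2 now runs and picks up the current schema.
        if (secondReloadRuns) {
            coreSchema = schemaOnDisk;
        }
        return coreSchema;
    }
}
```

With skipIfReloadInProgress set to true the core is left on schema version 1 even though version 2 was written; with false (the second reload waits its turn) it ends up on version 2.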
More to come.