Akira Ajisaka, I tracked down the issue. Indeed, it was my code that broke the tests. I'm not sure why I wasn't able to reproduce the failures consistently, but maybe that was another side effect of my earlier changes. The bug was that the MiniYARNCluster was explicitly calling transitionToActive on the resource manager with index 0. This overwrote some conf properties and reset the cluster max priority property back to DEFAULT.
I'm attaching a patch that fixes this issue by only transition the RM with index 0 to active if we are in an HA cluster.