Description
In our automated tests, we are seeing intermittent failures in TestZKFailoverController. I have been unable to reproduce the failures locally, but in examining the code, I found a difference that may explain the failures.
In trunk, HDFS-6440 ( Support more than 2 NameNodes. Contributed by Jesse Yates.) was checked in before HADOOP-11149. TestZKFailoverController times out), which changed the test added in HDFS-6440.
In branch-2, the order was reversed, and the test that was added in HDFS-6440 does not retain the fixes from HADOOP-11149.
Note that there was also a change from HDFS-10985. (o.a.h.ha.TestZKFailoverController should not use fixed time sleep before assertions.) that was missed in the HDFS-6440 backport.
My proposal is to restore the changes from HADOOP-11149. I made this change internally and it seems to have fixed the intermittent failures.