The check is not testing whether the client connection is up. It simply checks one of the metrics values reported in JMX. I don't know why NumOpenConnections was chosen. The test worked reliably until the JMX caching was fixed. The values used to be available right away, but now they take about 10 seconds to show up. So even when the test works, it adds about 10 seconds of delay.
But the original author also made a wrong assumption: that the number of connections is 2 because there are two datanodes. As you have correctly analyzed, this is not true in a MiniDFSCluster. Since the two datanodes share the same JVM, a single connection is shared for the DatanodeProtocol. The additional connection is made for the client. In a real distributed cluster, there would have been 3 connections.
I lean toward fixing the existing check rather than removing it. First, it shouldn't check against the number of datanodes, but simply against 2. As for increasing the IPC client idle timeout, that would make the test run longer, which goes against what we have been trying to do. An alternative is to add a test resource that reduces the JMX update interval. We could add a hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-metrics2.properties file with one line containing *.period=1. This would also reduce the run time of a number of other test cases that query JMX to verify results.
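
To make the proposal concrete, the test resource could look roughly like this. It is only a sketch of the one-line file described above; the comment line is added here for explanation and is not required.

```properties
# hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-metrics2.properties
# Publish metrics every second in tests, so JMX values show up
# after about 1 second rather than about 10.
*.period=1
```
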
What do you think?