OK, I think I've found a possible reason. The waitForDocCount method waits until a response comes back with the expected doc count, but it drops out of the wait loop the first time a query returns that count.
The test then goes out to each and every node and re-issues the request. This looks to be a 2-shard, 2-replica situation, so here's the theory: the wait loop is satisfied by, say, node2, but the test later queries node4 (both are replicas of shard2), which hasn't finished opening a new searcher yet, so it fails.
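To make the theory concrete, here is a minimal sketch of a wait loop that only succeeds once every replica reports the expected count, rather than exiting on the first matching response. This is an illustration, not the actual test-framework code and not necessarily the change being committed: the class name, the method name waitForDocCountOnAll, and the countDocs callback (which stands in for issuing a query directly against one replica) are all hypothetical.

```java
import java.util.List;
import java.util.function.ToLongFunction;

public class WaitForAllReplicas {

    /**
     * Hypothetical helper: polls until EVERY replica reports the expected doc
     * count, or until the timeout elapses. countDocs is a stand-in for a query
     * sent directly to a single replica.
     *
     * @return true if all replicas matched before the deadline, false otherwise
     */
    static <R> boolean waitForDocCountOnAll(List<R> replicas,
                                            ToLongFunction<R> countDocs,
                                            long expected,
                                            long timeoutMs,
                                            long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            boolean allMatch = true;
            for (R replica : replicas) {
                // A single lagging replica (one that has not opened its new
                // searcher yet) is enough to keep us waiting.
                if (countDocs.applyAsLong(replica) != expected) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out with at least one replica behind
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

With a loop like this, a response from node2 alone would not end the wait; node4 would have to return the same count before the per-node checks run.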
I could not get this to fail locally in 20 runs, so I'll beast the unchanged version some more to see. Meanwhile I'll commit this change, which I think is more correct anyway, and monitor.