For a given split, CqlRecordReader gets a list of replicas of that split. It then tries to create a Cluster object with a ClusterBuilder for each replica separately in a loop, so that if cluster creation fails for a certain replica, the next replica is tried. Unfortunately, this does not work, because cluster creation does not fail if the provided contact point is down. As a result, it always selects the first replica regardless of its state.
The solution is quite simple: ClusterBuilder accepts a collection of contact points, of which at least one must be up. So instead of iterating over the replicas, we can pass the whole set of them and the driver will select a working one. This will require some changes in the load balancing policy: I'm going to switch to a policy similar to the one we use in the OSS Spark Connector.
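To illustrate the difference between the two approaches, here is a toy model in plain Java. It does not use the real driver: FakeCluster, buildPerReplica, buildWithAllReplicas and the liveHosts parameter are all hypothetical names invented for this sketch. The only assumption it takes from the description above is that constructing a cluster never touches the network, so a down contact point is noticed only when connecting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContactPointSketch {

    /** Hypothetical stand-in for a cluster handle; NOT the real driver API. */
    public static final class FakeCluster {
        private final List<String> contactPoints;

        // Construction only records the contact points; it never checks liveness.
        FakeCluster(List<String> contactPoints) {
            this.contactPoints = new ArrayList<>(contactPoints);
        }

        /** Returns the first contact point that is in liveHosts, or throws if none is up. */
        public String connect(Set<String> liveHosts) {
            for (String host : contactPoints) {
                if (liveHosts.contains(host)) {
                    return host;
                }
            }
            throw new IllegalStateException("no contact point is up: " + contactPoints);
        }
    }

    /** Old approach: one cluster per replica, falling through to the next on failure. */
    public static FakeCluster buildPerReplica(List<String> replicas) {
        for (String replica : replicas) {
            try {
                // Never throws in this model, so the loop always returns on the first replica.
                return new FakeCluster(Collections.singletonList(replica));
            } catch (RuntimeException e) {
                // Intended fallback path; unreachable because construction cannot fail.
            }
        }
        throw new IllegalArgumentException("no replicas given");
    }

    /** Proposed approach: hand over every replica and let connect() pick a live one. */
    public static FakeCluster buildWithAllReplicas(List<String> replicas) {
        return new FakeCluster(replicas);
    }

    public static void main(String[] args) {
        List<String> replicas = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        // Assume the first replica is down.
        Set<String> liveHosts = new HashSet<>(Arrays.asList("10.0.0.2", "10.0.0.3"));

        try {
            buildPerReplica(replicas).connect(liveHosts);
        } catch (IllegalStateException e) {
            System.out.println("per-replica loop failed: " + e.getMessage());
        }

        String chosen = buildWithAllReplicas(replicas).connect(liveHosts);
        System.out.println("all-replicas builder connected via " + chosen);
    }
}
```

In this model the per-replica loop is stuck with the first replica even though it is down, while the builder that receives all replicas connects through the second one.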