Description
When using a ClientConfiguration (or a ZooKeeperInstance) to connect to Accumulo, the connection only ever uses the first ZooKeeper server in the list provided. When connections are made from a MapReduce job, this can lead to the first ZooKeeper server becoming overloaded.
The underlying problem seems to be in ClientConfiguration:
ClientConfiguration cc = ClientConfiguration.loadDefault().withZkHosts("host1,host2,host3");
// Will only return "host1" and not "host1,host2,host3"
System.out.println(cc.get(ClientConfiguration.ClientProperty.INSTANCE_ZK_HOST));
This happens because ClientConfiguration extends CompositeConfiguration, a class in commons-configuration. One of its default features is that setting a property to a comma-delimited string splits the string and stores it as a list property. If the property value is then retrieved using getString, only the first value in the list is returned.
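The behaviour described above can be modelled without the library. The following is a minimal sketch, not the commons-configuration implementation: DelimiterParsingSketch and its methods are hypothetical stand-ins, and the property key is used only for illustration. It splits the value on commas when the property is set and returns only the first element from the scalar getter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a configuration with automatic list parsing.
// This is NOT the commons-configuration code, only a model of the behaviour.
public class DelimiterParsingSketch {
    private final Map<String, List<String>> values = new HashMap<>();

    // Splits the value on commas at set time, so it is stored as a list.
    public void setProperty(String key, String value) {
        List<String> parts = new ArrayList<>();
        for (String part : value.split(",")) {
            parts.add(part.trim());
        }
        values.put(key, parts);
    }

    // Scalar lookup: returns only the first element of the stored list.
    public String getString(String key) {
        List<String> parts = values.get(key);
        return (parts == null || parts.isEmpty()) ? null : parts.get(0);
    }

    public static void main(String[] args) {
        DelimiterParsingSketch conf = new DelimiterParsingSketch();
        conf.setProperty("instance.zookeeper.host", "host1,host2,host3");
        // The remaining hosts are stored but never reach the caller.
        System.out.println(conf.getString("instance.zookeeper.host"));
    }
}
```

In this model the other two hosts are still present in the stored list; they are lost only because the caller uses the scalar getter.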
There seem to be a couple of solutions to this problem:
1. Disable the automatic list parsing. This is a trivial change: call setDelimiterParsingDisabled(true) in the ClientConfiguration constructor.
2. Change how all properties intended to be comma-separated lists are retrieved, so that the appropriate list-based retrieval method is used.
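The two options can be compared in a small, library-free model. This is a sketch under assumptions: ListPropertyFixSketch, its constructor flag, and getJoined are hypothetical names that mimic the described behaviour and are not part of Accumulo or commons-configuration. The flag stands in for option 1 (parsing disabled, value stored whole), and getJoined stands in for option 2 (list-based retrieval, rejoined with commas).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the two proposed fixes; not the real library classes.
public class ListPropertyFixSketch {
    private final Map<String, List<String>> values = new HashMap<>();
    private final boolean delimiterParsingDisabled;

    public ListPropertyFixSketch(boolean delimiterParsingDisabled) {
        this.delimiterParsingDisabled = delimiterParsingDisabled;
    }

    public void setProperty(String key, String value) {
        List<String> parts = new ArrayList<>();
        if (delimiterParsingDisabled) {
            // Option 1: keep the raw string as a single value.
            parts.add(value);
        } else {
            for (String part : value.split(",")) {
                parts.add(part.trim());
            }
        }
        values.put(key, parts);
    }

    // Scalar lookup: first element only.
    public String getString(String key) {
        List<String> parts = values.get(key);
        return (parts == null || parts.isEmpty()) ? null : parts.get(0);
    }

    // Option 2: read the whole list and rejoin it with commas.
    public String getJoined(String key) {
        List<String> parts = values.get(key);
        return parts == null ? null : String.join(",", parts);
    }

    public static void main(String[] args) {
        ListPropertyFixSketch disabled = new ListPropertyFixSketch(true);
        disabled.setProperty("zk", "host1,host2,host3");
        System.out.println(disabled.getString("zk"));

        ListPropertyFixSketch parsed = new ListPropertyFixSketch(false);
        parsed.setProperty("zk", "host1,host2,host3");
        System.out.println(parsed.getJoined("zk"));
    }
}
```

Option 1 is a single change in one place but affects every property in the configuration. Option 2 leaves parsing as it is but requires touching each call site that reads a list-valued property.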
Issue Links
- duplicates ACCUMULO-3218: ZooKeeperInstance only uses first ZooKeeper in list of quorum (Resolved)