We are using the Flume elasticsearch-sink in AWS, and for the sink's hostNames parameter we use the A record of an internal ELB (Elastic Load Balancer). When we run nslookup on the load balancer's hostname, we get back a list of node IPs (in our case we have 3 Elasticsearch nodes).
This config works fine as long as the IPs of the Elasticsearch nodes remain the same. If we restart one of our Elasticsearch nodes, it is assigned a new IP and Flume can no longer communicate with that node.
In the source code of the elasticsearch-sink I can see that a list of InetSocketTransportAddress objects is created. That list is probably the reason Flume stops working after an IP change and recovers only after the flume-ng-agent service is restarted.
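To illustrate what I suspect is happening, here is a minimal sketch using plain java.net.InetSocketAddress. I am assuming that InetSocketTransportAddress wraps an address built this way; the class and method names below are hypothetical and not taken from Flume, and "localhost" stands in for the ELB hostname.

```java
import java.net.InetSocketAddress;

// Illustrative only: ResolveOnceDemo and resolveOnce are hypothetical names, not Flume code.
class ResolveOnceDemo {

    // Builds the address once, the way a long-lived client would at startup.
    static InetSocketAddress resolveOnce(String host, int port) {
        // This constructor attempts the DNS lookup immediately and stores the result.
        return new InetSocketAddress(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress addr = resolveOnce("localhost", 9300);
        // The IP held by this object is fixed from now on; a later DNS change
        // is only seen by constructing a new InetSocketAddress.
        System.out.println(addr.getHostString() + " -> " + addr.getAddress());
        System.out.println("unresolved? " + addr.isUnresolved());
    }
}
```

If the sink builds its address list like this only once during configuration, that would match the behaviour we see: the old IP stays in use until the agent is restarted.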
- What is the recommended configuration for our case? Should we use static IPs for our Elasticsearch nodes and then put a comma-separated list of these IPs in the Flume configuration?
- Would it be possible to use the A record of the ELB in such a way that Flume always queries the A record to get one of the currently available IP addresses? Does this sound feasible, and is it worth spending some time on submitting a patch?
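For the second question, this is roughly the idea I have in mind, as a sketch rather than an actual patch: look up all A records for the hostname each time the address list is needed, instead of once at startup. The class and method names are hypothetical, and "localhost" again stands in for the ELB hostname. How the refreshed list would be handed to the Elasticsearch client is not shown here.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: ReResolvingAddresses and resolveAll are hypothetical names.
class ReResolvingAddresses {

    // Looks up every address currently registered for the host and pairs each with the port.
    static List<InetSocketAddress> resolveAll(String host, int port) {
        List<InetSocketAddress> result = new ArrayList<>();
        try {
            for (InetAddress ip : InetAddress.getAllByName(host)) {
                result.add(new InetSocketAddress(ip, port));
            }
        } catch (UnknownHostException e) {
            // Surface lookup failures so the caller can decide to keep the previous list.
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Calling this periodically, or after a connection failure, would pick up new node IPs.
        for (InetSocketAddress addr : resolveAll("localhost", 9300)) {
            System.out.println(addr);
        }
    }
}
```

One caveat: the JVM caches successful lookups itself (controlled by the networkaddress.cache.ttl security property), so re-resolving only helps if that TTL is finite and reasonably short.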