In a hardened cluster (Kerberos + SSL), a DNS load balancer cannot be used in front of the impalad daemons, because the user's connection attempt fails with a GSS failure (Wrong principal in request).
The reason seems to be that the DNS GSLB resolves its own name to the IP address of a different impalad each time (DNS round-robin load balancing), and when the impalad reverse-resolves that address it gets a hostname that doesn't match the one in the principal.
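The suspected forward/reverse mismatch can be sketched in a few lines of Python. This is only an illustration of the hypothesis above, not Impala's actual authentication code: the IP addresses, the impalad hostnames, the realm `EXAMPLE.COM`, and the `impala/dns.gslb@EXAMPLE.COM` principal are all hypothetical, and the DNS answers are simulated with stub resolvers so the example runs without a real GSLB.

```python
import socket


def canonical_host(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Forward-resolve a name to an IP, then reverse-resolve that IP to a hostname."""
    ip = forward(hostname)
    canonical, _aliases, _addresses = reverse(ip)
    return canonical.lower()


def derived_principal(service, hostname, realm, **resolvers):
    """Build the service principal that results from the forward/reverse round trip."""
    return f"{service}/{canonical_host(hostname, **resolvers)}@{realm}"


# Hypothetical DNS data: 'dns.gslb' rotates between two impalad addresses.
_rotation = iter(["10.0.0.1", "10.0.0.2"])
_ptr = {"10.0.0.1": "impalad1.example.com", "10.0.0.2": "impalad2.example.com"}


def fake_forward(name):
    # Simulated round-robin answer: a different address on each lookup.
    return next(_rotation)


def fake_reverse(ip):
    # Same return shape as socket.gethostbyaddr: (hostname, aliases, addresses).
    return (_ptr[ip], [], [ip])


expected = "impala/dns.gslb@EXAMPLE.COM"  # hypothetical principal for the GSLB name
first = derived_principal("impala", "dns.gslb", "EXAMPLE.COM",
                          forward=fake_forward, reverse=fake_reverse)
second = derived_principal("impala", "dns.gslb", "EXAMPLE.COM",
                           forward=fake_forward, reverse=fake_reverse)
print(first, first == expected)
print(second, second == expected)
```

With the stubs removed, `canonical_host("dns.gslb")` would use the system resolver, so the same check could be run from a client or an impalad host to see which hostname the round trip actually yields.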
Test performed to check the case:
Sample Environment description:
- CDH 5.7.4
- Kerberos + SSL
- DNS GSLB in front of Hive/Impala
Try to use Beeline to connect to the Impala JDBC service with the command:
The impalad refuses the authentication with the error:
Each time, the DNS GSLB resolves the name 'dns.gslb' to the IP address of a different impalad (DNS round-robin load balancing).
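One way to confirm the rotation is to resolve the name several times and compare the answers. The sketch below assumes nothing about the GSLB product; the helper names `observed_addresses` and `is_rotating` and the sample addresses are hypothetical, and a stub resolver stands in for the real DNS so the example is self-contained.

```python
import itertools
import socket


def observed_addresses(hostname, attempts=5, resolve=socket.gethostbyname):
    """Resolve hostname several times and return the addresses in the order seen."""
    return [resolve(hostname) for _ in range(attempts)]


def is_rotating(addresses):
    """True when successive lookups returned more than one distinct address."""
    return len(set(addresses)) > 1


# Stub resolver simulating round-robin answers (hypothetical addresses).
_cycle = itertools.cycle(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
seen = observed_addresses("dns.gslb", attempts=4, resolve=lambda name: next(_cycle))
print(seen, is_rotating(seen))
```

Against a real environment, calling `observed_addresses("dns.gslb")` without the stub uses the system resolver; note that a local DNS cache may hide the rotation for the duration of the record's TTL.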
Most probably this happens because the impalad tries to resolve and then reverse-resolve the DNS GSLB hostname, which points to a different impalad each time (and this prevents the principal from matching during authentication).
Impala is correctly configured to accept it, as shown below:
and impala.keytab contains the principals for both the DNS GSLB and the impalad itself.
Our analysis suggests this happens because the impalad tries to resolve and reverse-resolve the hostname of the DNS GSLB, which points to a different impalad each time (DNS round-robin load balancing).
This does not happen with Hive, which seems to work with this DNS GSLB.
DNS GSLB is used by many companies in enterprise environments and is well integrated with a large number of applications.