  1. IMPALA
  2. IMPALA-5023

supportability: DNS GSLB in front of IMPALA (env: Kerberos+SSL)


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Impala 2.5.0
    • Fix Version/s: None
    • Component/s: Security

    Description

      Issue Description:
      In a hardened cluster (Kerberos + SSL), a DNS load balancer cannot be used in front of the impalad daemons: the user's connection attempt fails with a GSS failure (Wrong principal in request).
      The reason seems to be that the DNS GSLB resolves its own name to the IP address of a different impalad on each lookup (DNS round-robin load balancing). When the impalad reverse-resolves that address, it gets a hostname that does not match the one in the principal.

      Test performed to check the case:
      Sample Environment description:

      • CDH 5.7.4
      • KERBEROS + SSL
      • DNS GSLB in front of Hive/Impala

      Use beeline to connect to the Impala JDBC service with the command:

      beeline 'jdbc:hive2://dns.gslb:21050/;ssl=true;principal=impala/dns.gslb@EXAMPLE.COM'
      

      The impalad refuses the authentication with the error:

      E1229 08:31:22.778329 22628 authentication.cc:155] SASL message (Kerberos (external)): GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Wrong principal in request)
      

      The DNS GSLB resolves the name 'dns.gslb' to the IP address of a different impalad on each lookup (DNS round-robin load balancing).
      Most probably the failure happens because the impalad tries to resolve and reverse-resolve the DNS GSLB hostname, which points to a different impalad each time, and this prevents the principal from matching during authentication.
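The suspected forward/reverse mismatch can be checked from any host with a short script. The following is a minimal sketch, not Impala's actual authentication logic: it resolves a name to its addresses, reverse-resolves each address, and reports which Kerberos principals built from the reverse names differ from the one built from the original name. The function names and the injectable `resolve`/`reverse` parameters are assumptions made for illustration; by default the sketch uses the system resolver through Python's `socket` module.

```python
import socket


def reverse_names(hostname, resolve=None, reverse=None):
    """Map each IP address that `hostname` resolves to onto its reverse-DNS name."""
    # Default to the system resolver; callers may inject fakes for offline checks.
    if resolve is None:
        resolve = lambda h: sorted({ai[4][0] for ai in socket.getaddrinfo(h, None)})
    if reverse is None:
        reverse = lambda ip: socket.gethostbyaddr(ip)[0]
    return {ip: reverse(ip) for ip in resolve(hostname)}


def mismatched_principals(hostname, service="impala", realm="EXAMPLE.COM",
                          resolve=None, reverse=None):
    """Return the expected principal and the per-IP principals that differ from it."""
    expected = f"{service}/{hostname}@{realm}"
    found = {}
    for ip, name in reverse_names(hostname, resolve, reverse).items():
        candidate = f"{service}/{name}@{realm}"
        if candidate != expected:
            found[ip] = candidate
    return expected, found
```

Calling `mismatched_principals("dns.gslb")` against the environment described here would be expected to list one differing principal per impalad behind the GSLB name, if the reverse records point at the individual impalad hostnames as the analysis assumes.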

      Impala is configured to accept the GSLB principal, as shown below:

      --be_principal=impala/impala.hostname@EXAMPLE.COM 
      --principal=impala/dns.gslb@EXAMPLE.COM 
      --keytab_file=/var/run/cloudera-scm-agent/process/10247-impala-IMPALAD/impala.keytab
      

      The impala.keytab contains the principals for both the DNS GSLB and the impalad itself.
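The keytab claim can be verified mechanically. The sketch below assumes the text comes from running `klist -kt` on the keytab (without `-e`, so that the principal is the last field of each entry row) and checks that every required principal is present. The helper names are hypothetical and introduced only for this illustration.

```python
def keytab_principals(klist_output):
    """Extract principal names from the text printed by `klist -kt <keytab>`."""
    principals = set()
    for line in klist_output.splitlines():
        fields = line.split()
        # Entry rows start with the numeric key version number (KVNO);
        # header and separator rows do not. The principal is the last field.
        if fields and fields[0].isdigit():
            principals.add(fields[-1])
    return principals


def missing_principals(klist_output, required):
    """Return the required principals that do not appear in the keytab listing."""
    return sorted(set(required) - keytab_principals(klist_output))
```

An empty result for `impala/dns.gslb@EXAMPLE.COM` and `impala/impala.hostname@EXAMPLE.COM` would confirm that the keytab is not the cause of the failure.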

      Our analysis suggests this happens because the impalad tries to resolve and reverse-resolve the hostname of the DNS GSLB, which points to a different impalad each time (DNS round-robin load balancing).

      This does not happen with Hive, which seems to work with this DNS GSLB.

      DNS GSLB is used by many companies in enterprise environments and is well integrated with a large number of applications.

          People

            Assignee: Unassigned
            Reporter: Adriano Simone