I was talking with Alex Moundalexis this weekend about some work he had recently done getting Apache Impala set up behind a load balancer. He told me that, when Kerberos is in the picture, each impalad daemon actually has two Kerberos identities: one for the hostname that the impalad service is running on and another for the load balancer's hostname. The load balancer itself continues to do a simple pass-through.
Right now, the Avatica server can only accept a single Kerberos principal and keytab. This means that we can't use Kerberos authentication when the client can reach the server via multiple hostnames, which rules out 'dumb' load balancers (hypothetically, a smart load balancer could make it work). We could configure the Avatica server to use a principal with the load balancer's hostname, but then users would be unable to connect directly to the server.
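To make the two-identity requirement concrete, here is a minimal sketch of the service principals a server would need to accept if clients can reach it both directly and through a load balancer. It relies on the SPNEGO convention that an HTTP client asks for a ticket for HTTP/<hostname it connected to>@<REALM>. The class name, method names, hostnames, and realm are all hypothetical; this is not Avatica code, and lower-casing the hostname is an assumption based on common practice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Avatica: lists the SPNEGO service
// principals a server would have to hold keys for when it is reachable
// under several hostnames (its own plus a pass-through load balancer's).
final class SpnegoPrincipals {
    private SpnegoPrincipals() {}

    // SPNEGO clients request a ticket for HTTP/<host>@<REALM>, where <host>
    // is the name they connected to. Lower-casing is an assumption here.
    static String forHost(String hostname, String realm) {
        return "HTTP/" + hostname.toLowerCase(Locale.ROOT) + "@" + realm;
    }

    // One principal per hostname under which clients may reach the server.
    static List<String> forHosts(List<String> hostnames, String realm) {
        List<String> principals = new ArrayList<>();
        for (String hostname : hostnames) {
            principals.add(forHost(hostname, realm));
        }
        return principals;
    }
}
```

For a server on avatica1.example.com behind lb.example.com in realm EXAMPLE.COM, `SpnegoPrincipals.forHosts(...)` would yield HTTP/avatica1.example.com@EXAMPLE.COM and HTTP/lb.example.com@EXAMPLE.COM. With a single principal and keytab, the server can only answer to one of those two names.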
I know that Impala uses (or at least exposes) Thrift, which has its own SASL implementation; maybe they do something tricky there? Maybe we can glean something from their implementation (even though it's not HTTP-based). I don't think JAAS lets us have multiple active logins, so I'm not even sure where to begin.
Ideally, we would understand this well enough to provide deployment guidance that lets users keep an identical deployment topology for both "secure" and "unsecure" scenarios.