The HiveServer2 can authenticate a client using via Kerberos and impersonate the connecting user with underlying secure hadoop. This becomes a gateway for a remote client to access secure hadoop cluster. Now this works fine for when the client obtains Kerberos ticket and directly connects to HiveServer2. There's another big use case for middleware tools where the end user wants to access Hive via another server. For example Oozie action or Hue submitting queries or a BI tool server accessing to HiveServer2. In these cases, the third party server doesn't have end user's Kerberos credentials and hence it can't submit queries to HiveServer2 on behalf of the end user.
This ticket is for enabling proxy access to HiveServer2 for third party tools on behalf of end users. There are two parts of the solution proposed in this ticket:
1) Delegation token based connection for Oozie (
This is the common mechanism for Hadoop ecosystem components. Hive Remote Metastore and HCatalog already support this. This is suitable for tool like Oozie that submits the MR jobs as actions on behalf of its client. Oozie already uses similar mechanism for Metastore/HCatalog access.
2) Direct proxy access for privileged hadoop users
The delegation token implementation can be a challenge for non-hadoop (especially non-java) components. This second part enables a privileged user to directly specify an alternate session user during the connection. If the connecting user has hadoop level privilege to impersonate the requested userid, then HiveServer2 will run the session as that requested user. For example, user Hue is allowed to impersonate user Bob (via core-site.xml proxy user configuration). Then user Hue can connect to HiveServer2 and specify Bob as session user via a session property. HiveServer2 will verify Hue's proxy user privilege and then impersonate user Bob instead of Hue. This will enable any third party tool to impersonate alternate userid without having to implement delegation token connection.