Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- Impala 2.3.0
- None
Description
- SaslAuthProvider::RunKinit() has a work loop that gets a new ticket every X seconds (depending on the config).
- It also renews the ticket 1500 ms after getting it, just so that the ticket can be put into the credential cache. (It has to wait 1500 ms because of a Kerberos bug that only shows up on RHEL6 with Kerberos version 1.8.1 and up.)
- We believe that if any authentication happens in that window, we will get an error like this:
Couldn't open transport for vd0534.halxg.cloudera.com:22000 (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credential cache is empty))
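To make the suspected window concrete, here is a minimal C++ sketch of the timing described above. It is a model only, not Impala's code: `KinitCycle`, `InSuspectedWindow`, and `WindowFraction` are hypothetical names, and the model assumes each cycle lasts exactly the configured refresh interval, ignoring the time the kinit commands themselves take.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the refresh cycle; not taken from Impala's source.
struct KinitCycle {
  int64_t refresh_interval_ms;  // the "X seconds" from the config, in ms
  int64_t renew_delay_ms;       // the 1500 ms wait before the renewal
};

// Returns true if an authentication attempt at `now_ms` (measured from the
// first ticket acquisition) falls between a fresh ticket and its renewal.
bool InSuspectedWindow(const KinitCycle& c, int64_t now_ms) {
  assert(c.refresh_interval_ms > 0);
  if (now_ms < 0) return false;
  return (now_ms % c.refresh_interval_ms) < c.renew_delay_ms;
}

// Fraction of each cycle spent inside the suspected window.
double WindowFraction(const KinitCycle& c) {
  assert(c.refresh_interval_ms > 0);
  const int64_t window = std::min(c.renew_delay_ms, c.refresh_interval_ms);
  return static_cast<double>(window) /
         static_cast<double>(c.refresh_interval_ms);
}
```

With an assumed refresh interval of one hour (3,600,000 ms), the 1500 ms window covers roughly 0.04% of each cycle, which would be consistent with the failure being rare.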
This is hard to reproduce and confirm, but I will update this issue if I am able to.
We also believe that this bug triggers a code path that results in hangs, because a channel is left open on another node indefinitely, as documented in IMPALA-2592.
We need to investigate whether there is a better way of doing this that avoids the window.
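One direction that could be explored is to tolerate the window on the client side by retrying when the connection fails with the error quoted above. This does not remove the window, and it is not necessarily the fix that was adopted. The sketch below is only an illustration: `OpenResult`, `IsEmptyCredCacheError`, and `OpenWithRetry` are hypothetical names, and matching on the error string is an assumption made for brevity.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Hypothetical result type; Impala's real Status class is not used here.
struct OpenResult {
  bool ok;
  std::string error;
};

// Returns true if the error text looks like the transient failure above.
bool IsEmptyCredCacheError(const std::string& error) {
  return error.find("Credential cache is empty") != std::string::npos;
}

// Calls `open_fn` up to `max_attempts` times, sleeping `backoff` between
// attempts, but only retries when the credential cache looks empty.
OpenResult OpenWithRetry(const std::function<OpenResult()>& open_fn,
                         int max_attempts,
                         std::chrono::milliseconds backoff) {
  assert(max_attempts > 0);
  OpenResult result{false, "not attempted"};
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    result = open_fn();
    // Success and unrelated failures are returned to the caller immediately.
    if (result.ok || !IsEmptyCredCacheError(result.error)) return result;
    if (attempt < max_attempts) std::this_thread::sleep_for(backoff);
  }
  return result;  // last failure after exhausting all attempts
}
```

If this approach were used, a backoff somewhat longer than the 1500 ms wait would give the renewal time to complete before the next attempt.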
The links below explain why we do what we do in RunKinit():
http://www.cloudera.com/content/www/en-us/documentation/archive/cdh/3-x/3u6/CDH3-Security-Guide/cdh3sg_topic_14_2.html
https://jira.cloudera.com/browse/OPSAPS-11159