Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.7.3
-
None
Description
Issue:
If a user has not kinit'd their keytab, and attempts to run a %spark paragraph, it will hang indefinitely, and if they then try to restart the Spark interpreter, the whole of Zeppelin Crashes.
Enviroment:
- Zeppelin 0.7.3
- Spark 2.2
- Yarn-Client (Kerberized)
- User Impersonation Enabled (Per user, in isolated process)
- Shiro authenticating users through AD
- ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false
- ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '
While this might seem like a crazy setup, it is extremely common in enterprise, as users have differing permissions in the Hadoop environment.
(I am aware that zeppelin can proxy users if it has its own keytab, but many Zeppelin users cannot do that for now.)
Things to Fix:
- Firstly, there is seemingly no timeout for failing to initialize the Spark interpreter. (Meaning it hangs forever)
- Secondly, while it is in the hanging state, restarting the Spark interpreter will crash Zeppelin for everyone. (Sometimes it will come back after a 20+ Min)