Uploaded image for project: 'Zeppelin'
  1. Zeppelin
  2. ZEPPELIN-3455

Zeppelin crashes if Spark interpreter is restarted in a hanging state (And no hang timeout)

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.7.3
    • Fix Version/s: None
    • Component/s: Interpreters
    • Labels:

      Description

      Issue:

      If a user has not kinit'd their keytab, and attempts to run a %spark paragraph, it will hang indefinitely, and if they then try to restart the Spark interpreter, the whole of Zeppelin Crashes.

       

      Enviroment:

      • Zeppelin 0.7.3
      • Spark 2.2
      • Yarn-Client (Kerberized) 
      • User Impersonation Enabled (Per user, in isolated process) 
      • Shiro authenticating users through AD
      • ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false
      • ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '

      While this might seem like a crazy setup, it is extremely common in enterprise, as users have differing permissions in the Hadoop environment.

      (I am aware that zeppelin can proxy users if it has its own keytab, but many Zeppelin users cannot do that for now.)

       

      Things to Fix:

      • Firstly, there is seemingly no timeout for failing to initialize the Spark interpreter. (Meaning it hangs forever)
      • Secondly, while it is in the hanging state, restarting the Spark interpreter will crash Zeppelin for everyone. (Sometimes it will come back after a 20+ Min) 

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              thesuperzapper Mathew Wicks
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated: