ZEPPELIN-3455

Zeppelin crashes if the Spark interpreter is restarted while in a hanging state (and there is no hang timeout)


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.7.3
    • Fix Version/s: None
    • Component/s: Interpreters

    Description

      Issue:

      If a user has not run kinit with their keytab and attempts to run a %spark paragraph, the paragraph hangs indefinitely. If they then try to restart the Spark interpreter, the whole of Zeppelin crashes.
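
      As a stopgap on the user side, the missing-ticket condition can be detected before a %spark paragraph is run. The sketch below is not part of Zeppelin. It assumes MIT Kerberos `klist` is on the PATH of the impersonated user and relies on `klist -s` exiting with a non-zero status when the credential cache holds no valid ticket. The function name `has_valid_ticket` and its `command` parameter are hypothetical.

```python
import subprocess


def has_valid_ticket(command=("klist", "-s"), timeout_s=10):
    """Return True if `command` exits with status 0 within `timeout_s` seconds.

    With the default command, `klist -s` prints nothing and exits 0 only
    when the credential cache holds a valid (non-expired) ticket.
    """
    try:
        result = subprocess.run(
            list(command),
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # klist is missing or unresponsive: treat as "no usable ticket".
        return False
    return result.returncode == 0
```

      The check has to run as the same OS user that the interpreter process is launched as, since each user has their own credential cache.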

       

      Environment:

      • Zeppelin 0.7.3
      • Spark 2.2
      • Yarn-Client (Kerberized) 
      • User Impersonation Enabled (Per user, in isolated process) 
      • Shiro authenticating users through AD
      • ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false
      • ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '

      While this might seem like a crazy setup, it is extremely common in enterprise environments, because users have differing permissions in the Hadoop environment.

      (I am aware that Zeppelin can proxy users if it has its own keytab, but many Zeppelin users cannot do that for now.)

       

      Things to Fix:

      • First, there appears to be no timeout when the Spark interpreter fails to initialize, so it hangs forever.
      • Second, restarting the Spark interpreter while it is in the hanging state crashes Zeppelin for everyone. (Sometimes it comes back after 20+ minutes.)
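
      To illustrate the first point, a bounded wait around a blocking initialization step might look like the sketch below. This is not Zeppelin's code (Zeppelin is written in Java); it is only an illustration of the idea in Python, and `open_with_timeout`, `open_fn`, `timeout_s`, and `InterpreterOpenTimeout` are hypothetical names.

```python
import concurrent.futures


class InterpreterOpenTimeout(Exception):
    """Raised when initialization does not finish within the time limit."""


def open_with_timeout(open_fn, timeout_s):
    """Run the blocking `open_fn` in a worker thread, waiting at most `timeout_s` seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(open_fn)
    try:
        # Returns open_fn's result, or re-raises whatever open_fn raised.
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise InterpreterOpenTimeout(
            f"initialization exceeded {timeout_s}s"
        ) from None
    finally:
        # Do not block on a possibly hung worker; just stop accepting new work.
        executor.shutdown(wait=False)
```

      Note that this only stops the caller from waiting. The worker running the hung call is still alive and has to be cleaned up separately, for example by killing the interpreter process, which is the part that relates to the second point.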

          People

            Assignee: Unassigned
            Reporter: Mathew Wicks (thesuperzapper)
            Votes: 0
            Watchers: 4