LIVY-852: Livy unable to recover upon losing connection with Zookeeper

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Server
    • Labels: None

    Description

      We've noticed that the change made for LIVY-732 appears to have altered Livy's behavior when it loses its connection to Zookeeper. Before that pull request, Livy would exit with an exit code of 1 when the connection was lost, allowing it to be restarted. Now Livy continues to run but returns a 404 when the REST API is accessed:

      <html>
      <head>
      <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
      <title>Error 404 </title>
      </head>
      <body>
      <h2>HTTP ERROR: 404</h2>
      <p>Problem accessing /sessions. Reason:
      <pre> Not Found</pre></p>
      <hr /><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.24.v20180605</a><hr/>
      </body>
      </html>

      The direct cause of this change in behavior appears to be the UnhandledErrorListener, which was changed from calling System.exit(1) to throwing a LivyUncaughtException. See line 74 of server/src/main/scala/org/apache/livy/server/recovery/ZooKeeperStateStore.scala and line 72 of server/src/main/scala/org/apache/livy/server/recovery/ZooKeeperManager.scala at https://github.com/apache/incubator-livy/pull/267/files.

      On the whole, this change appears to be undesirable: Livy becomes completely unresponsive after Zookeeper reconnects (no log or error messages are printed after the uncaught exception is thrown), so it has to be checked and restarted manually. On the other hand, System.exit(1) seems to be a roundabout way of fixing the issue, and registering a ConnectionStateListener instead of an UnhandledErrorListener might be better.
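
To make the ConnectionStateListener suggestion concrete, here is a minimal sketch of what it could look like. It assumes Apache Curator is the Zookeeper client (as the listener names above suggest) and that curator-framework is on the classpath. The names ZkConnectionWatch, isFatal and listener, the retry settings, and the default address are hypothetical and are not taken from Livy's code; treating only the LOST state as fatal is likewise an assumption, not a confirmed fix.

```scala
import org.apache.curator.framework.{CuratorFramework, CuratorFrameworkFactory}
import org.apache.curator.framework.state.{ConnectionState, ConnectionStateListener}
import org.apache.curator.retry.ExponentialBackoffRetry

// Hypothetical helper, not part of Livy.
object ZkConnectionWatch {

  // Assumed policy: SUSPENDED and RECONNECTED are transient; only LOST
  // (Curator has given up on the connection) is treated as fatal.
  def isFatal(state: ConnectionState): Boolean =
    state == ConnectionState.LOST

  // The fatal-state action is injected so it can be System.exit(1),
  // a graceful shutdown, or a test stub.
  def listener(onFatal: () => Unit): ConnectionStateListener =
    new ConnectionStateListener {
      override def stateChanged(client: CuratorFramework,
                                newState: ConnectionState): Unit = {
        if (isFatal(newState)) onFatal()
      }
    }

  def main(args: Array[String]): Unit = {
    // Assumed address; in Livy this would come from its configuration.
    val zkAddress = args.headOption.getOrElse("localhost:2181")
    val client = CuratorFrameworkFactory.newClient(
      zkAddress, new ExponentialBackoffRetry(1000, 3))

    // Exit with code 1 on a lost connection so a supervisor can restart
    // the process, mirroring the behavior described before LIVY-732.
    client.getConnectionStateListenable.addListener(listener(() => sys.exit(1)))
    client.start()

    Thread.sleep(Long.MaxValue) // keep the sketch alive to observe state changes
  }
}
```

Unlike an UnhandledErrorListener, which only fires for errors in background operations, this listener is notified of every connection state transition, so the decision to exit (or to attempt recovery) is tied to the connection state itself.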

      It would be good to figure out whether this line should be reverted to System.exit(1), or whether there is a better way of handling this issue.

          People

            Assignee: Unassigned
            Reporter: James Chen (jameschen1519)