  1. Zeppelin
  2. ZEPPELIN-4599

Zeppelin becomes unresponsive and can only be recovered by restart

Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.2, 0.9.0

    Description

      We use Zeppelin with 10-20 users working primarily in Spark. Every few days, and sometimes several times a day, the Zeppelin web UI becomes unresponsive, and the only fix we have found is to restart Zeppelin. This is extremely disruptive.

      "Unresponsive" usually means one or more of the following: new paragraphs can no longer be created, clicking run does nothing or leaves the paragraph stuck in pending forever, new notebooks cannot be created, or existing notebooks fail to load.

      We have added monitoring to the box Zeppelin runs on and see nothing out of the ordinary in GC rates, CPU utilization, memory usage, or heap utilization.

      We also don't see anything unusual in the logs. Is there any other way we can diagnose this issue to help find the root cause? 0.9 is currently too broken for us to use (based on builds of the live code on 1/27/2020 and again on 2/3/2020).

      Attaching a copy of the logs just in case.
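
      For anyone triaging thread dumps such as the attached jstack files, one way to narrow a hang like this down is to count threads per state and list which threads are blocked on which monitor. The following is a minimal sketch, assuming standard HotSpot jstack text output. The helper name summarize_jstack is hypothetical and not part of Zeppelin, and the parsing is deliberately simple, so treat the result as a starting point rather than a diagnosis.

```python
import re
import sys
from collections import Counter

# Patterns assume the usual HotSpot jstack layout (an assumption, not verified
# against the attached files): a quoted thread name starts each thread block,
# followed by an indented "java.lang.Thread.State:" line and stack frames.
THREAD_RE = re.compile(r'^"(?P<name>[^"]+)"')
STATE_RE = re.compile(r'^\s+java\.lang\.Thread\.State: (?P<state>[A-Z_]+)')
WAIT_RE = re.compile(r'^\s+- waiting to lock <(?P<lock>0x[0-9a-f]+)>')


def summarize_jstack(text):
    """Return (state_counts, blocked) for a jstack dump passed as a string.

    state_counts maps a thread state (RUNNABLE, BLOCKED, ...) to a count.
    blocked maps each BLOCKED thread name to the monitor address it waits on.
    """
    states = Counter()
    blocked = {}
    current = None        # name of the thread block being parsed
    current_state = None  # state of that thread, once seen
    for line in text.splitlines():
        m = THREAD_RE.match(line)
        if m:
            current = m.group("name")
            current_state = None
            continue
        m = STATE_RE.match(line)
        if m and current is not None:
            current_state = m.group("state")
            states[current_state] += 1
            continue
        m = WAIT_RE.match(line)
        if m and current_state == "BLOCKED":
            # Keep only the first monitor reported for each blocked thread.
            blocked.setdefault(current, m.group("lock"))
    return states, blocked


if __name__ == "__main__":
    # Usage: python summarize_jstack.py <path-to-jstack-dump>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        counts, waiting = summarize_jstack(f.read())
    for state, n in counts.most_common():
        print(f"{state}: {n}")
    for name, lock in sorted(waiting.items()):
        print(f"BLOCKED {name} waiting on {lock}")
```

      Comparing the output of several dumps taken a few seconds apart while the UI is hung shows whether the same threads stay blocked on the same monitor, which is usually more informative than a single dump.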

      Attachments

        1. example-interpreter-process-jstack.log
          114 kB
          Paul Brenner
        2. image-2021-02-20-17-13-14-559.png
          100 kB
          Sha Xia
        3. image-2021-02-20-17-13-14-559.png
          100 kB
          Sha Xia
        4. jstack.log
          300 kB
          Sha Xia
        5. zeppelin_server_13_34.log
          1.35 MB
          Sha Xia
        6. zeppelin_server.log
          12.97 MB
          Sha Xia
        7. zeppelin212-2021-03-23.out
          44 kB
          Paul Brenner
        8. zeppelin212-2021-03-26.out
          238 kB
          Paul Brenner
        9. zeppelin9jstack log.out
          63 kB
          Paul Brenner
        10. zeppelin-server-jstack.log
          252 kB
          Paul Brenner
        11. zeppelin-yarn-zeppelin-210.sec.placeiq.net.log
          1.39 MB
          Paul Brenner
        12. zeppelin-yarn-zeppelin-210.sec.placeiq.net.out
          1.23 MB
          Paul Brenner

          People

            Assignee: Unassigned
            Reporter: Paul Brenner (pbrenner)
            Votes: 2
            Watchers: 4
