  1. Zeppelin
  2. ZEPPELIN-2515

After 100 minutes R process quits silently and spark.r interpreter becomes unresponsive.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.7.0, 0.7.1, 0.7.2
    • Fix Version/s: 0.8.0
    • Component/s: Interpreters, r-interpreter
    • Labels: None
    • Environment: Ubuntu 16.04.2 LTS Server

    • Important

    Description

      On Zeppelin 0.7.1, the R process quits, crashes, or is killed for no apparent reason while the Spark session remains alive, so the %spark.r interpreter must be restarted. On 0.7.0, restarting the interpreter doesn't always work, so Zeppelin itself requires a full restart.

      These are the steps I followed to reproduce this behaviour:

      1) Enable log4j debug properties.

      2) Start a brand new instance of zeppelin issuing:
      service zeppelin stop
      service zeppelin start

      3) Open an existing notebook or create a new one and execute this block of code:

      %spark.r
      2+2

      4) Wait for 3) to finish and close the browser. The zeppelin log should report something like this:

      INFO [2017-05-08 12:26:15,879] ({qtp423031029-60} NotebookServer.java[onClose]:363) - Closed connection to 127.0.0.1 : 33798. (1001) null

      5) Wait several minutes (usually 30 minutes or more) without using Zeppelin and without reconnecting via the browser. At some point R will quit, and the interpreter log file should contain lines similar to these:

      DEBUG [2017-05-08 13:08:00,187] ({Exec Stream Pumper} InterpreterOutputStream.java[processLine]:72) - Interpreter output:Error in handleErrors(returnStatus, conn) :
      DEBUG [2017-05-08 13:08:00,188] ({Exec Stream Pumper} InterpreterOutputStream.java[processLine]:72) - Interpreter output: No status is returned. Java SparkR backend might have failed.
      DEBUG [2017-05-08 13:08:00,188] ({Exec Stream Pumper} InterpreterOutputStream.java[processLine]:72) - Interpreter output:Calls: <Anonymous> -> invokeJava -> handleErrors
      DEBUG [2017-05-08 13:08:00,188] ({Exec Stream Pumper} InterpreterOutputStream.java[processLine]:72) - Interpreter output:Execution halted

      6) If you repeat the previous steps from 3) to 5), R will quit again after at least 30 minutes.
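
To measure how long R survives after the browser disconnects, the two log excerpts from steps 4) and 5) can be compared by timestamp. The following is only a rough sketch, not part of Zeppelin: the function names are hypothetical, and it assumes that every relevant log line starts with a level and a bracketed timestamp in the layout shown above, and that "Closed connection" and "Execution halted" are reliable markers for the two events.

```python
import re
from datetime import datetime

# Timestamp layout as seen in the log excerpts of this report,
# e.g. "DEBUG [2017-05-08 13:08:00,187] (...) - message".
LOG_LINE = re.compile(
    r"^\s*[A-Z]+\s+\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
)


def parse_timestamp(line):
    """Return the datetime of a log line, or None if it has no timestamp."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")


def first_match(lines, needle):
    """Return the timestamp of the first timestamped line containing needle."""
    for line in lines:
        if needle in line:
            timestamp = parse_timestamp(line)
            if timestamp is not None:
                return timestamp
    return None


def minutes_until_r_quit(server_log_lines, interpreter_log_lines):
    """Minutes between the browser disconnect and R halting, or None."""
    closed = first_match(server_log_lines, "Closed connection")
    halted = first_match(interpreter_log_lines, "Execution halted")
    if closed is None or halted is None:
        return None
    return (halted - closed).total_seconds() / 60.0


if __name__ == "__main__":
    # The two lines below are copied from the excerpts in this report.
    server_log = [
        "INFO [2017-05-08 12:26:15,879] ({qtp423031029-60} "
        "NotebookServer.java[onClose]:363) - Closed connection to "
        "127.0.0.1 : 33798. (1001) null",
    ]
    interpreter_log = [
        "DEBUG [2017-05-08 13:08:00,188] ({Exec Stream Pumper} "
        "InterpreterOutputStream.java[processLine]:72) - "
        "Interpreter output:Execution halted",
    ]
    print(round(minutes_until_r_quit(server_log, interpreter_log), 1))  # 41.7
```

For the excerpts quoted in this report, the gap between the closed connection and "Execution halted" is a little under 42 minutes.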

      The machine running Zeppelin is a server and the client is located on another machine.
      The error is reproducible with Zeppelin 0.7.0 and 0.7.1 using Spark 2.1.0 and 2.1.1 (latest available version).

      This issue has been discussed here:
      https://lists.apache.org/thread.html/17d8bbea0b755a9cf866a6d78909493d63b09bfa92a477bc0e28bc1c@%3Cusers.zeppelin.apache.org%3E

            People

              Assignee: Jeff Zhang (zjffdu)
              Reporter: Pietro Pugni (pietrop)
              Votes: 1
              Watchers: 10
