ZEPPELIN-3027

Notebook interpreter restart makes running notebooks abort


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.2, 0.7.3
    • Fix Version/s: None
    • Component/s: Interpreters
    • Labels: None
    • Flags: Important

    Description

      When I set the isolation mode of an interpreter to:

      • Per user: isolated
      • Per note: isolated

      I would expect a separate interpreter session and process for at least each notebook, but that is not what happens in the following scenario:

      'user1' has 'notebook1' bound to an interpreter 'i'. While 'user1' is running 'notebook1', 'user2' restarts interpreter 'i' from another note, 'notebook2' (via the 'Interpreter binding' option).
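
      For illustration, here is a minimal sketch of a long-running paragraph in 'notebook1' that gets interrupted by the restart. The paragraph body is hypothetical; it assumes the standard %spark interpreter, where 'spark' is the SparkSession Zeppelin provides:

      // Hypothetical long-running %spark paragraph in 'notebook1'. The filter
      // forces per-element work so the job runs long enough to be interrupted
      // by the restart triggered from 'notebook2'.
      val slow = spark.range(0L, 2000000000L).filter(_ % 2 == 0)
      val total = slow.count()   // Dataset.count, as in the stack trace below
      println(s"total = $total")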

      The behaviour I would expect is for 'notebook1' to finish correctly. Instead, when interpreter 'i' is restarted, 'notebook1' stops with status 'ABORT' or throws an error like:

      org.apache.spark.SparkException: Job 1 cancelled part of cancelled job group zeppelin-20171102-121114_940110794
      at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
      at org.apache.spark.scheduler.DAGScheduler.handleJobCancellation(DAGScheduler.scala:1439)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply$mcVI$sp(DAGScheduler.scala:799)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply(DAGScheduler.scala:799)
      at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleJobGroupCancelled$1.apply(DAGScheduler.scala:799)
      at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
      at org.apache.spark.scheduler.DAGScheduler.handleJobGroupCancelled(DAGScheduler.scala:799)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1689)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
      at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
      at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
      at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
      at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
      at org.apache.spark.SparkContext.runJob(SparkContext.scala:2043)
      at org.apache.spark.SparkContext.runJob(SparkContext.scala:2062)
      at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
      at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:936)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
      at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
      at org.apache.spark.rdd.RDD.collect(RDD.scala:935)
      at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:278)
      at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2430)
      at org.apache.spark.sql.Dataset$$anonfun$count$1.apply(Dataset.scala:2429)
      at org.apache.spark.sql.Dataset$$anonfun$55.apply(Dataset.scala:2837)
      at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
      at org.apache.spark.sql.Dataset.withAction(Dataset.scala:2836)
      at org.apache.spark.sql.Dataset.count(Dataset.scala:2429)
      ... 47 elided

      The same behaviour is observed with any combination of interpreter binding modes.
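
      The job group id in the message above ('zeppelin-20171102-121114_940110794') suggests the likely mechanism: Zeppelin tags each paragraph's Spark jobs with a job group, and a restart cancels job groups in the interpreter process, so when notes share that process the cancellation also kills 'notebook1''s jobs. The standalone sketch below reproduces the same SparkException using only the public SparkContext API; it is an illustration under that assumption, not the actual Zeppelin restart code, and all names and sizes are made up:

      import org.apache.spark.{SparkConf, SparkContext}

      object JobGroupCancelRepro {
        def main(args: Array[String]): Unit = {
          val sc = new SparkContext(
            new SparkConf().setMaster("local[2]").setAppName("job-group-cancel-repro"))

          val runner = new Thread(new Runnable {
            def run(): Unit = {
              // Job groups are thread-local, so tag the jobs on the thread that
              // runs them, roughly the way Zeppelin tags a paragraph's jobs.
              sc.setJobGroup("zeppelin-20171102-121114_940110794", "notebook1 paragraph")
              try {
                // A deliberately huge job that will still be running when the
                // group is cancelled from the other thread.
                sc.range(0L, Long.MaxValue / 100000).filter(_ % 2 == 0).count()
              } catch {
                // Fails with: "Job ... cancelled part of cancelled job group ..."
                case e: Exception => println(s"aborted: ${e.getMessage}")
              }
            }
          })
          runner.start()

          Thread.sleep(2000) // let the job start
          // What a restart effectively does when notes share the process:
          sc.cancelJobGroup("zeppelin-20171102-121114_940110794")
          runner.join()
          sc.stop()
        }
      }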


    People

    • Assignee: Unassigned
    • Reporter: Cristina Luengo Agullo (cluengo)
    • Votes: 0
    • Watchers: 3
