Zeppelin / ZEPPELIN-5140

After a Spark interpreter timeout, no progress is shown when the paragraph is rerun


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.9.0
    • Labels:
      None
    • Environment:

      zeppelin-0.9.0-SNAPSHOT build from the Master

      Spark-2.4

      Description

      Step1:

      Set:

      zeppelin.interpreter.lifecyclemanager.class = org.apache.zeppelin.interpreter.lifecycle.TimeoutLifecycleManager
      zeppelin.interpreter.lifecyclemanager.timeout.threshold = 300000

      At this point everything works: the paragraph bound to the Spark interpreter runs normally and the progress bar shows the percentage.

      Step2:

      More than 5 minutes later, rerun the same paragraph. This time the paragraph's status stays PENDING indefinitely and the progress bar is missing.
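
      The 5-minute wait in Step 2 corresponds to the 300000 ms threshold set in Step 1. The sketch below only illustrates that arithmetic and the kind of idle check a timeout-based lifecycle manager performs. IdleTimeoutSketch and isExpired are hypothetical names, and the strict "greater than" comparison is an assumption, not a copy of Zeppelin's TimeoutLifecycleManager.

```java
import java.time.Duration;

/** Hypothetical helper, not Zeppelin code: models an idle-timeout check. */
final class IdleTimeoutSketch {
    // Value of zeppelin.interpreter.lifecyclemanager.timeout.threshold from Step 1 (milliseconds).
    static final long THRESHOLD_MS = 300_000L;

    /** Returns true once the interpreter has been idle for longer than the threshold (assumed semantics). */
    static boolean isExpired(long lastBusyMillis, long nowMillis) {
        return nowMillis - lastBusyMillis > THRESHOLD_MS;
    }

    public static void main(String[] args) {
        // 300000 ms is exactly 5 minutes.
        System.out.println(Duration.ofMillis(THRESHOLD_MS).toMinutes()); // prints 5
        System.out.println(isExpired(0L, 300_000L)); // prints false (idle exactly 5 minutes)
        System.out.println(isExpired(0L, 300_001L)); // prints true  (idle longer than 5 minutes)
    }
}
```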

      Cause of the issue:

      1. When the RemoteInterpreter expires, TimeoutLifecycleManager calls RemoteInterpreterEventServer.unRegisterInterpreterProcess, which only removes the RemoteInterpreterGroup without closing it.
      2. When the paragraph runs again, a new RemoteInterpreterGroup is instantiated, which asks the SchedulerFactory for a RemoteScheduler to submit the paragraph to.
      3. SchedulerFactory always finds the existing RemoteScheduler, so the previous RemoteScheduler, which still holds the old RemoteInterpreter, is returned.
      4. The JobStatusPoller started by that RemoteScheduler uses the old RemoteInterpreter to get the status, so an exception is thrown and the poll fails.
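
      The four steps above can be modelled in a few lines. This is a simplified sketch, not Zeppelin's code: the nested classes are hypothetical stand-ins that only borrow the names from the report, and the assumption is that the scheduler cache is keyed by name and never evicted when the group is merely removed.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-ins that mirror the report's names; these are NOT Zeppelin's classes. */
final class StaleSchedulerSketch {
    static final class Interpreter {
        boolean alive = true; // becomes false once the remote process has timed out
    }

    static final class Scheduler {
        final Interpreter interpreter;
        Scheduler(Interpreter interpreter) { this.interpreter = interpreter; }

        /** Stands in for the status poll: fails if the held interpreter is dead. */
        String pollStatus() {
            if (!interpreter.alive) {
                throw new IllegalStateException("interpreter process is gone");
            }
            return "RUNNING";
        }
    }

    static final class SchedulerFactory {
        private final Map<String, Scheduler> schedulers = new HashMap<>();

        /** Returns the cached scheduler if one exists, even if its interpreter is stale. */
        Scheduler getOrCreate(String name, Interpreter interpreter) {
            return schedulers.computeIfAbsent(name, n -> new Scheduler(interpreter));
        }
    }

    public static void main(String[] args) {
        SchedulerFactory factory = new SchedulerFactory();

        Interpreter old = new Interpreter();
        factory.getOrCreate("spark", old);   // first run: scheduler created for the old interpreter
        old.alive = false;                   // timeout: group removed, but scheduler never evicted

        Interpreter fresh = new Interpreter();
        Scheduler reused = factory.getOrCreate("spark", fresh); // rerun: cached scheduler returned

        System.out.println(reused.interpreter == old); // prints true: still bound to the old interpreter
        try {
            reused.pollStatus();
        } catch (IllegalStateException e) {
            System.out.println("poll failed: " + e.getMessage()); // prints poll failed: interpreter process is gone
        }
    }
}
```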

      How to fix:

      The fix is simple: add the following code to the RemoteInterpreterEventServer.unRegisterInterpreterProcess function:

      // Close the RemoteInterpreter when the RemoteInterpreterServer has already timed out.
      // Otherwise the progress bar will be missing when the paragraph is rerun after the
      // RemoteInterpreterServer timeout, and old RemoteInterpreterGroups will stay alive even after GC.
      interpreterGroup.close();
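
      Why closing the group helps can be sketched as follows. This is an illustrative model under stated assumptions, not the actual patch: SchedulerRegistry, InterpreterGroup and EventServer are hypothetical classes, and the assumption that close() releases the group's cached scheduler is inferred from the cause described in this report.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the fix; class and method names are illustrative only. */
final class CloseOnUnregisterSketch {
    static final class SchedulerRegistry {
        private final Map<String, Object> schedulers = new HashMap<>();

        Object getOrCreate(String name) {
            return schedulers.computeIfAbsent(name, n -> new Object());
        }

        void remove(String name) {
            schedulers.remove(name);
        }
    }

    static final class InterpreterGroup implements AutoCloseable {
        final String id;
        final Object scheduler;
        private final SchedulerRegistry registry;

        InterpreterGroup(String id, SchedulerRegistry registry) {
            this.id = id;
            this.registry = registry;
            this.scheduler = registry.getOrCreate(id);
        }

        /** Assumed behaviour: closing the group also evicts its scheduler. */
        @Override
        public void close() {
            registry.remove(id);
        }
    }

    static final class EventServer {
        private final Map<String, InterpreterGroup> groups = new HashMap<>();

        void register(InterpreterGroup group) {
            groups.put(group.id, group);
        }

        void unRegisterInterpreterProcess(String id) {
            InterpreterGroup group = groups.remove(id);
            if (group != null) {
                group.close(); // the added call: without it the old scheduler stays cached
            }
        }
    }

    public static void main(String[] args) {
        SchedulerRegistry registry = new SchedulerRegistry();
        EventServer server = new EventServer();

        InterpreterGroup first = new InterpreterGroup("spark", registry);
        server.register(first);
        server.unRegisterInterpreterProcess("spark"); // timeout path

        InterpreterGroup second = new InterpreterGroup("spark", registry); // rerun
        System.out.println(first.scheduler != second.scheduler); // prints true: a fresh scheduler is created
    }
}
```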

       


            People

            • Assignee: Shulei Zheng (zhengslei)
            • Reporter: Shulei Zheng (zhengslei)


                Time Tracking

                • Estimated: Not Specified
                • Remaining: 0h
                • Logged: 0.5h