Zeppelin / ZEPPELIN-4018

[Umbrella] Workflow and orchestration


    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:


      Zeppelin currently supports only crontab-style scheduling: a note is executed periodically at a specified time.

      In a real production environment, crontab-style scheduling is too simple. It cannot orchestrate paragraphs that use different interpreters, across multiple notes (or within a single note), in a specific execution order.

      We have created a lot of notes in our Zeppelin deployment, and we urgently need Zeppelin to support workflow orchestration. This would close the loop on data processing and make Zeppelin more than just an interactive development tool.

      This is especially true in machine learning, where tasks generally run for a long time.
      A typical example is as follows:
      1) First, obtain data from HDFS through Spark;
      2) Clean and transform the data through Spark SQL;
      3) Extract features from the data through Spark;
      4) Write the TensorFlow algorithm through Hadoop Submarine;
      5) Distribute the TensorFlow algorithm as a job to YARN or k8s for batch processing;
      6) Publish the trained model and provide online prediction services;
      7) Run model prediction through Flink;
      8) Receive incremental data through Flink to incrementally update the model.

      Therefore, Zeppelin especially needs the ability to orchestrate workflows.
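
      As a rough illustration of the requested behavior, the eight steps above can be modeled as a dependency graph and run in dependency order, stopping when a step fails. This is only a minimal sketch in plain Python, not Zeppelin's API: the step names, the `workflow` mapping, and the `run_step` callback are all hypothetical, and each step is assumed to depend only on the one listed before it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step names mirroring the eight-step example above.
# Each step maps to the set of steps that must finish before it starts.
workflow = {
    "load_hdfs_spark": set(),
    "clean_sparksql": {"load_hdfs_spark"},
    "extract_features_spark": {"clean_sparksql"},
    "write_tensorflow_submarine": {"extract_features_spark"},
    "train_on_yarn_or_k8s": {"write_tensorflow_submarine"},
    "publish_model": {"train_on_yarn_or_k8s"},
    "predict_flink": {"publish_model"},
    "incremental_update_flink": {"predict_flink"},
}


def execution_order(dag):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def run_workflow(dag, run_step):
    """Run steps in dependency order; stop at the first step that fails.

    run_step is a caller-supplied function (hypothetical here) that
    executes one step and returns True on success, False on failure.
    Returns the list of steps that completed successfully.
    """
    completed = []
    for step in execution_order(dag):
        if not run_step(step):
            break
        completed.append(step)
    return completed
```

      A cron trigger alone cannot express this: it starts one note at a fixed time, whereas here each step starts only after the steps it depends on have succeeded. If the graph contains a cycle, `TopologicalSorter` raises `graphlib.CycleError` instead of returning an order.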

      Please refer to the ongoing design doc, and add your thoughts:






            • Assignee:
              liuxun323 Xun Liu

