Comdev GSOC / GSOC-199

GSoC: Python API CLI enhancement

Details

    Description

      About pydolphinscheduler

      PyDolphinScheduler is the Python API for Apache DolphinScheduler. It lets you define workflows in Python code, an approach known as workflow-as-code. See the PyDolphinScheduler documentation [4] for more detail. All of its source code is kept as a submodule in the main DolphinScheduler codebase [5].

      The Goal

      Make pydolphinscheduler's CLI more powerful, so that it can operate on DolphinScheduler's models, run pydolphinscheduler code, and visualize a workflow's DAG in the terminal.

      Detail

      So far, the Apache DolphinScheduler Python API has a CLI that supports only a few commands. Our community wants it to become a more powerful tool that supports as many commands as possible (except commands that raise security issues).

      For now it supports only `version` and `config`; see [1] for more detail.

      We think the following commands would be helpful for the CLI. You may add other commands if they are needed, but make sure to discuss them in the community first:

      • `run <DAG name> [--example]`: Run a local workflow DAG file or one of the built-in examples
      • `users`: User operations: CRUD
      • `projects`: Project operations: CRUD, granting projects to other users
      • `tenants`: Tenant operations: CRUD
      • `workflow`: Workflow operations: CRUD and renaming, which should also rename the local Python file
      • `visualize`: Show the task graph in the terminal
      • etc.
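
      As a rough illustration of how two of the commands above could be wired up, here is a minimal sketch using `click` (which the requirements below mention) and the standard-library `graphlib`. Everything in it is an assumption made for illustration: the `cli` group, the `EXAMPLE_DAG` mapping, the `render_layers` helper, and the printed messages are hypothetical and are not part of the existing pydolphinscheduler CLI. The `run` command only echoes what it would do, and `visualize` prints a hard-coded graph layer by layer instead of reading a real workflow.

```python
from graphlib import TopologicalSorter

import click

# Hypothetical workflow used only for illustration: task -> upstream tasks.
EXAMPLE_DAG = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"transform"},
}


def render_layers(dag):
    """Return one text line per layer of tasks that could run in parallel."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    lines = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        lines.append(" | ".join(ready))
        sorter.done(*ready)
    return lines


@click.group()
def cli():
    """Hypothetical entry point; the real command layout may differ."""


@cli.command()
@click.argument("dag_name")
@click.option("--example", is_flag=True, help="Treat DAG_NAME as a built-in example.")
def run(dag_name, example):
    """Placeholder for running a local DAG file or a built-in example."""
    source = "built-in example" if example else "local file"
    click.echo(f"Would run {dag_name} ({source})")


@cli.command()
def visualize():
    """Print the example task graph, one layer per line."""
    for depth, line in enumerate(render_layers(EXAMPLE_DAG)):
        click.echo(f"{depth}: {line}")
```

      Calling `cli()` from a script's entry point would expose `run` and `visualize` as subcommands. Grouping commands under one `click.group` keeps each command in its own function, so `users`, `projects`, and `tenants` could be added the same way.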

      Besides adding functionality, we should also consider the CLI's output and make it clearer and more appealing. We may consider using the following packages (and should also look for other interesting packages that help):

      • rich [2]: For highlighting our output, possibly through an existing rich plugin such as `click-rich`
      • tabulate [3]: For rendering tables in the terminal

      What Can You Learn

      We hope everyone who joins GSoC learns something from the project. By finishing this project, you could learn:

      • How to write production-level Python code and docs: improving your Python, writing tests with `pytest` and `tox`, writing documentation with `sphinx` and its related plugins, and formatting and linting your Python code
      • More about task scheduling systems: what they are, what they focus on, and how they run

      If You Are Interested in It

      If you want to take this ticket, you should have:

      • (Must) Python skills, especially with packages such as click and pytest
      • A little knowledge of task scheduling systems
      • (Optional) Basic Java knowledge, which helps because the Apache DolphinScheduler core is written in Java and you may need to add some functional code to it

      Mentors

      • Calvin Kirs: Committer of Apache {DolphinScheduler, SeaTunnel, Wayang}, DolphinScheduler PMC and SeaTunnel PPMC
      • Jiajie Zhong: Committer of Apache {Airflow, DolphinScheduler, SeaTunnel}, SeaTunnel PPMC

       

      [1]: https://dolphinscheduler.apache.org/python/cli.html

      [2]: https://github.com/Textualize/rich

      [3]: https://github.com/astanin/python-tabulate

      [4]: https://dolphinscheduler.apache.org/python/index.html

      [5]: https://github.com/apache/dolphinscheduler/tree/dev/dolphinscheduler-python/pydolphinscheduler

          People

            Assignee: Unassigned
            Reporter: Jiajie Zhong (zhongjiajie)