Apache Ozone / HDDS-2443

Python client/interface for Ozone


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Ozone Client
    • Labels: None

    Description

      This Jira tracks development of a Python client/interface for Ozone.

      Original idea: item #25 in https://cwiki.apache.org/confluence/display/HADOOP/Ozone+project+ideas+for+new+contributors

      Ozone client (Python) for data science notebooks such as Jupyter.

      1. Size: Large
      2. PyArrow: https://pypi.org/project/pyarrow/
      3. Python -> libhdfs (HDFS JNI library; HDFS, S3, ...) -> Java client API. Impala uses libhdfs.
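
The Python -> libhdfs -> Java client chain in item 3 might look roughly like the sketch below. It is untested against a real cluster and rests on several assumptions that are not stated in this issue: `core-site.xml` sets `fs.defaultFS` to an Ozone file system URI (for example an `ofs://` address), the Ozone file system jar is on the Hadoop `CLASSPATH`, and libhdfs can be located by pyarrow. The helper names `ofs_path` and `hdfs_round_trip` are hypothetical.

```python
def ofs_path(volume, bucket, key):
    """Join volume, bucket and key into a rooted path: /volume/bucket/key.

    Assumption: the cluster is addressed through the rooted "ofs" scheme,
    where paths take this form.
    """
    return "/" + "/".join(part.strip("/") for part in (volume, bucket, key))


def hdfs_round_trip(path, payload):
    """Write payload to path through libhdfs, then read it back.

    pyarrow is imported lazily so that ofs_path stays usable without it.
    """
    from pyarrow import fs  # third-party: pip install pyarrow

    # "default" tells libhdfs to use fs.defaultFS from core-site.xml
    # (assumed here to point at Ozone).
    hdfs = fs.HadoopFileSystem("default")

    with hdfs.open_output_stream(path) as out:
        out.write(payload)

    with hdfs.open_input_stream(path) as src:
        return src.read()
```

With such a setup in place, `hdfs_round_trip(ofs_path("vol1", "bucket1", "hello.txt"), b"hello ozone")` would be expected to return the bytes it wrote; the volume and bucket names are placeholders and must already exist.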

      Paths to try:

      1. S3 interface: Ozone S3 gateway (already supported) + AWS Python client (boto3)
      2. Python-native RPC
      3. pyarrow + libhdfs, which uses the Java client under the hood.
      4. Python + C interface of a Go / Rust Ozone library. I created POC Go / Rust clients earlier, which can be improved if the libhdfs interface is not good enough. [By Marton Elek]
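
As an illustration of option 1, a minimal boto3 sketch against the Ozone S3 gateway could look like the following. This is not the attached OzoneS3.py; the host, the gateway port 9878, the placeholder credentials (assumed to be acceptable on an unsecured cluster), the region value, and the helper names `ozone_s3_kwargs` and `s3_round_trip` are all assumptions made for the example.

```python
def ozone_s3_kwargs(host="localhost", port=9878,
                    access_key="any-key", secret_key="any-secret"):
    """Build keyword arguments for boto3.client("s3", ...).

    Assumptions: the S3 gateway listens on plain HTTP at host:port and the
    cluster is unsecured, so placeholder credentials are accepted.
    """
    return {
        "endpoint_url": f"http://{host}:{port}",
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        # boto3 wants a region for request signing; the value is a placeholder.
        "region_name": "us-east-1",
    }


def s3_round_trip(bucket, key, payload, **client_kwargs):
    """Create a bucket, upload payload under key, and read it back."""
    import boto3  # third-party: pip install boto3

    s3 = boto3.client("s3", **client_kwargs)
    s3.create_bucket(Bucket=bucket)
    s3.put_object(Bucket=bucket, Key=key, Body=payload)
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
```

Calling `s3_round_trip("demo-bucket", "hello.txt", b"hello ozone", **ozone_s3_kwargs())` against a running gateway would be expected to return the uploaded bytes. Because only the endpoint differs from a stock AWS setup, existing boto3-based notebook code should need little change under this approach.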

      Attachments

        1. OzoneS3.py (2 kB, Li Cheng)
        2. Ozone with pyarrow.html (33 kB, Yi-Sheng Lien)
        3. pyarrow_ozone_test.docx (18 kB, mingchao zhao)
        4. pyarrow_ozone_test.docx (16 kB, mingchao zhao)


          People

            Assignee: Unassigned
            Reporter: Li Cheng (timmylicheng)

