This Jira tracks development of a Python client/interface for Ozone.
Original idea: item #25 in https://cwiki.apache.org/confluence/display/HADOOP/Ozone+project+ideas+for+new+contributors
Ozone client (Python) for data science notebooks such as Jupyter.
- Size: Large
- PyArrow: https://pypi.org/project/pyarrow/
- Python -> libhdfs HDFS JNI library (HDFS, S3,...) -> Java client API. Impala uses libhdfs.
Paths to try:
- S3 interface: Ozone S3 gateway (already supported) + AWS Python client (boto3)
- Python native RPC
- pyarrow + libhdfs, which uses the Java client under the hood.
- Python + C interface of a Go / Rust Ozone library. I created POC Go / Rust clients earlier, which can be improved if the libhdfs interface is not good enough. [By Marton Elek]
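
As a rough illustration of the first path (S3 gateway + boto3), here is a minimal sketch, not a definitive implementation. The helper names (ozone_s3_client_kwargs, make_ozone_s3_client, roundtrip), the endpoint URL, the credentials, and the region value are all assumptions made for the example; only the boto3 calls themselves (boto3.client, create_bucket, put_object, get_object) are real API.

```python
from typing import Any, Dict


def ozone_s3_client_kwargs(endpoint_url: str, access_key: str, secret_key: str) -> Dict[str, Any]:
    """Build keyword arguments for boto3.client() aimed at an Ozone S3 gateway.

    Hypothetical helper: all values are supplied by the caller.
    """
    return {
        "service_name": "s3",
        # Point boto3 at the Ozone S3 gateway instead of AWS.
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        # Assumption: boto3 wants some region for request signing;
        # the value here is a placeholder.
        "region_name": "us-east-1",
    }


def make_ozone_s3_client(endpoint_url: str, access_key: str, secret_key: str):
    """Create a boto3 S3 client that talks to the gateway (hypothetical helper)."""
    import boto3  # imported lazily so the other helpers work without boto3 installed

    return boto3.client(**ozone_s3_client_kwargs(endpoint_url, access_key, secret_key))


def roundtrip(client, bucket: str, key: str, payload: bytes) -> bytes:
    """Create a bucket, write one object, and read it back."""
    client.create_bucket(Bucket=bucket)
    client.put_object(Bucket=bucket, Key=key, Body=payload)
    return client.get_object(Bucket=bucket, Key=key)["Body"].read()
```

From a notebook, one could call make_ozone_s3_client("http://localhost:9878", "<access-key>", "<secret-key>") and pass the result to roundtrip(client, "bucket1", "hello.txt", b"hello ozone"). The endpoint assumes a local gateway on port 9878; adjust it and the credentials to the actual cluster.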