Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Description
As a Python developer working with several cloud vendors and their storage services, I'd like to jump quickly to code examples showing how to read and write files on each filesystem.
The documentation concerned is the Python filesystems page: https://arrow.apache.org/docs/python/filesystems.html
I find the information there a bit scattered. It could be improved with the following organisation:
- Filesystem Interface
  - overview of the PyArrow FS interface
- Usage
  - Local Filesystem
    - description
    - Writing files
      - code example
    - Listing files
      - code example
    - Reading files
      - code example
  - S3 Filesystem
    - description / configuration
    - Writing files
      - code example
    - Listing files
      - code example
    - Reading files
      - code example
  - Hadoop Filesystem
    - description / configuration
    - Writing files
      - code example
    - Listing files
      - code example
    - Reading files
      - code example
  - Extending to fsspec-compatible filesystems
    - description
    - Google Cloud Storage
      - code example
    - Azure
      - code example
That way, a developer working on S3 can jump directly to the section of interest and start experimenting with the code examples.
Additionally, if new Python bindings are created for an "Arrow native" filesystem, the documentation can be extended with a new section in the same vein as the others.