Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
From https://github.com/apache/arrow/issues/10492
- The chapter "Writing to Partitioned Datasets" still presents a solution based on "hdfs.connect". Since that function is documented as deprecated, the chapter should no longer recommend it.
- The chapter "Reading a Parquet File from Azure Blob storage" relies on an old version of the "azure.storage.blob" package. The current "azure-sdk-for-python" no longer provides methods such as get_blob_to_stream(). This part could be updated to use the current Blob Storage API, and a further example could cover the same concept with Delta Lake (similar principle but since there are differences ...)