Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
Description
A few odd things are going on with the Python bindings:
- Statefulness. I ran the following code:
    import os
    import pyarrow.fs as arrow_fs

    def fs_():
        s3_fs = arrow_fs.S3FileSystem(
            access_key="<token>",
            secret_key="<token>",
            endpoint_override="<cloudflare r2 url>",
        )
        return s3_fs

    fs = fs_()
    print(fs.get_file_info("data"))
and it worked on one machine but not the other. Only setting region="auto" allowed the code to work consistently on both computers.
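For reference, a minimal sketch of the workaround described above, with the region passed explicitly. The helper names r2_filesystem_kwargs and make_r2_filesystem are hypothetical, as are the placeholder credentials and endpoint. The assumption, not confirmed in this report, is that with no region given each machine ends up resolving a different one from its own environment.

```python
def r2_filesystem_kwargs(access_key, secret_key, endpoint_url, region="auto"):
    """Collect S3FileSystem keyword arguments with an explicit region.

    Hypothetical helper: passing region explicitly means the result does not
    depend on whatever the local machine would otherwise resolve.
    """
    return {
        "access_key": access_key,
        "secret_key": secret_key,
        "endpoint_override": endpoint_url,
        "region": region,
    }


def make_r2_filesystem(access_key, secret_key, endpoint_url, region="auto"):
    """Build the filesystem; pyarrow is imported lazily, only when called."""
    import pyarrow.fs as arrow_fs

    kwargs = r2_filesystem_kwargs(access_key, secret_key, endpoint_url, region)
    return arrow_fs.S3FileSystem(**kwargs)
```

Usage would look like fs = make_r2_filesystem("<token>", "<token>", "<cloudflare r2 url>"), followed by fs.get_file_info("data") as in the original script.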
Furthermore, the error message is very opaque:
    Traceback (most recent call last):
      File "cluster_scripts/test_s3.py", line 51, in <module>
        print(fs.get_file_info("data"))
      File "pyarrow/_fs.pyx", line 439, in pyarrow._fs.FileSystem.get_file_info
      File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status
      File "pyarrow/error.pxi", line 114, in pyarrow.lib.check_status
    OSError: When getting information for bucket 'data': AWS Error [code 100]: No response body.
Googling this error turns up nothing useful. I eventually tracked the problem down by switching from Cloudflare to S3; when the failure persisted there too, I set a region explicitly. The whole experience was pretty painful.
Issue Links
- is related to ARROW-18238 [Python] Improve docs for S3FileSystem / bucket region resolution (Resolved)
- links to