Details
- Bug
- Status: Resolved
- Minor
- Resolution: Fixed
- None
Description
Hi team, this is my first time posting an issue, so I apologize if the format is lacking. My original comment is on the GitHub issue for ARROW-13685.
Long story short, our environment is tightly locked down: my application has permission to write data under an S3 prefix, but I do not have the ListBucket permission and cannot add it. This does not prevent me from using the "individual" file APIs like pq.write_table, but the bucket validation logic in the "dataset" APIs fails when it tries to test for the bucket's existence.
pq.write_to_dataset(pa.Table.from_batches([data]), location, filesystem=s3fs)
OSError: When creating bucket '<my bucket>': AWS Error [code 15]: Access Denied
The same is true for the generic pyarrow.dataset APIs. My understanding is that the bucket validation logic is part of the C++ code, not the Python API. As a Pythonista who knows nothing of C++, I am not sure how to resolve this problem.
Would it be possible to disable the bucket existence check with an optional keyword argument? Thank you for your time!
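To make the request concrete, here is a minimal toy sketch of the behavior described above. It is not pyarrow or the Arrow C++ implementation: `LockedDownStore`, `ensure_bucket`, `write_file`, `write_dataset`, and the `check_bucket` keyword are all hypothetical names invented for illustration. It only models a store where object writes under one prefix succeed while any bucket-level call is denied, and shows how an opt-out flag would let the dataset-style write go through.

```python
class AccessDenied(OSError):
    """Raised when the toy store rejects an operation."""


class LockedDownStore:
    """Toy object store (hypothetical): object writes under one prefix are
    allowed, bucket-level calls are always denied."""

    def __init__(self, writable_prefix):
        self.writable_prefix = writable_prefix
        self.objects = {}

    def ensure_bucket(self, bucket):
        # Stands in for the bucket existence/creation check, which needs
        # bucket-level permissions this caller does not have.
        raise AccessDenied(f"When creating bucket '{bucket}': Access Denied")

    def put_object(self, bucket, key, data):
        if not key.startswith(self.writable_prefix):
            raise AccessDenied(f"Access Denied for key '{key}'")
        self.objects[(bucket, key)] = data


def write_file(store, bucket, key, data):
    """'Individual' file path: writes the object directly, no bucket check."""
    store.put_object(bucket, key, data)


def write_dataset(store, bucket, base_key, parts, check_bucket=True):
    """'Dataset' path: validates the bucket first unless check_bucket is False."""
    if check_bucket:
        store.ensure_bucket(bucket)
    for i, data in enumerate(parts):
        store.put_object(bucket, f"{base_key}/part-{i}.parquet", data)


store = LockedDownStore(writable_prefix="team/data")

# The single-file write succeeds with prefix-level permissions only.
write_file(store, "my-bucket", "team/data/single.parquet", b"one")

# The dataset write fails by default because of the bucket check.
try:
    write_dataset(store, "my-bucket", "team/data/ds", [b"a", b"b"])
    default_error = None
except AccessDenied as exc:
    default_error = str(exc)

# With the hypothetical opt-out keyword, the same write succeeds.
write_dataset(store, "my-bucket", "team/data/ds", [b"a", b"b"], check_bucket=False)

print(default_error)
print(sorted(key for _, key in store.objects))
```

The point of the sketch is only that the bucket check is separable from the object writes themselves, so skipping it would not change what actually gets written.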
Issue Links
- relates to ARROW-15906 [C++] S3Filesystem shouldn't create new buckets by default (Resolved)
- links to