Apache Arrow / ARROW-13685

[C++] Cannot write dataset to S3FileSystem if bucket already exists

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.0
    • Fix Version/s: 6.0.0
    • Component/s: C++

    Description

      I'm trying to write a Parquet file to an existing S3 bucket using the new S3FileSystem interface. The write fails with an AWS Access Denied error, even though I have the necessary access to the bucket. It appears to be trying to create the bucket, which already exists.

      import numpy as np
      import pyarrow as pa
      from pyarrow import fs
      import pyarrow.dataset as ds
      
      s3 = fs.S3FileSystem(region="us-west-2")
      table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})
      ds.write_dataset(
          table,
          "my-bucket/test.parquet",
          format="parquet",
          filesystem=s3,
      )

      This raises:

      OSError: When creating bucket 'my-bucket': AWS Error [code 15]: Access Denied
      

      I'm seeing the same behavior using S3FileSystem.create_dir when recursive=True.

      s3.create_dir("my-bucket/test_dir/", recursive=True) # Fails
      s3.create_dir("my-bucket/test_dir/", recursive=False) # Succeeds
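
Since the non-recursive call succeeds, one possible workaround on affected versions is to create each directory level below the bucket with recursive=False, so the bucket itself is never touched. The following is only a sketch under that assumption: create_dir_below_bucket is a hypothetical helper, not part of pyarrow, and it relies only on the create_dir(path, recursive=...) method shown above.

```python
def create_dir_below_bucket(filesystem, path):
    """Create every directory level under the bucket, one level at a time.

    Hypothetical helper (not part of pyarrow). `filesystem` is assumed to be
    any object exposing create_dir(path, recursive=...), such as
    pyarrow.fs.S3FileSystem. Returns the list of paths it asked to create.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path must start with a bucket name")

    bucket, subdirs = parts[0], parts[1:]
    created = []
    current = bucket
    for name in subdirs:
        current = f"{current}/{name}"
        # recursive=False avoids the bucket-creation attempt described above.
        filesystem.create_dir(current, recursive=False)
        created.append(current)
    return created
```

With the filesystem from the report, this would be called as create_dir_below_bucket(s3, "my-bucket/test_dir/"). It does not change the behavior of ds.write_dataset itself.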
      

       

            People

              Assignee: westonpace Weston Pace
              Reporter: coverman Caleb Overman
              Votes: 1
              Watchers: 7

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 5h 10m