Details
- Improvement
- Status: Resolved
- Minor
- Resolution: Won't Fix
- 2.7.5, 3.1.0
- None
- None
Description
We are running Apache Spark jobs with aws-java-sdk-1.7.4.jar and hadoop-aws-2.7.5.jar to write Parquet files to an S3 bucket. The bucket contains the key 's3://mybucket/d1/d2/d3/d4/d5/d6/d7' (d7 is a text file). It also contains the key 's3://mybucket/d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180615/a.parquet' (a.parquet is a file).
When we run a Spark job to write b.parquet under 's3://mybucket/d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180616/' (that is, we expect 's3://mybucket/d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180616/b.parquet' to be created in S3), we get the following error:
org.apache.hadoop.fs.FileAlreadyExistsException: Can't make directory for path 's3a://mybucket/d1/d2/d3/d4/d5/d6/d7' since it is a file.
at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:861)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1881)
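The error message suggests that mkdirs walks up the ancestors of the target path and refuses to continue once it finds one that exists as a plain file. The following is a minimal sketch of that kind of check, modelled on an in-memory set of object keys rather than on the Hadoop API. The class name AncestorCheckSketch and the method findFileAncestor are hypothetical and are not taken from S3AFileSystem; the keys are the ones from this report.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical, simplified model of an ancestor check; not the S3AFileSystem code.
final class AncestorCheckSketch {

    /**
     * Walks from dirKey up towards the bucket root and returns the first
     * key (dirKey itself or one of its ancestors) that exists as a file.
     */
    static Optional<String> findFileAncestor(Set<String> fileKeys, String dirKey) {
        String current = dirKey;
        while (!current.isEmpty()) {
            if (fileKeys.contains(current)) {
                return Optional.of(current);
            }
            // Drop the last path component to move to the parent key.
            int slash = current.lastIndexOf('/');
            current = slash < 0 ? "" : current.substring(0, slash);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Object keys from the report, relative to the bucket.
        Set<String> fileKeys = Set.of(
            "d1/d2/d3/d4/d5/d6/d7",
            "d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180615/a.parquet");
        String target = "d1/d2/d3/d4/d5/d6/d7/d8/d9/part_dt=20180616";

        findFileAncestor(fileKeys, target).ifPresentOrElse(
            key -> System.out.println("Cannot make directory: '" + key + "' is a file"),
            () -> System.out.println("No file ancestor found; mkdirs could proceed"));
    }
}
```

In this model the walk reaches 'd1/d2/d3/d4/d5/d6/d7', finds it in the set of file keys, and stops there, which matches the path named in the exception.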
Issue Links
- is related to
  - HADOOP-15525 s3a: clarify / improve support for mixed ACL buckets (Open)
  - HADOOP-13278 S3AFileSystem mkdirs does not need to validate parent path components (Open)
- relates to
  - HADOOP-13221 s3a create() doesn't check for an ancestor path being a file (Resolved)