Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
By default, the S3FileSystem does not set the Content-Type header, which is technically correct, since I don't think a content type should be specified if one isn't known.
However, the AWS SDK for C++ (aws-sdk-cpp) appears to set the content type to application/xml whenever one is not specified: https://github.com/aws/aws-sdk-cpp/blob/5378016f845fe85e334ffc30319614e7d4dad41f/aws-cpp-sdk-s3/include/aws/s3/S3Request.h#L41
We could potentially file an issue with the S3 SDK, but I'm not sure that would make any progress (and S3 itself may require that a content type always be present for some reason).
Since there is no way to avoid specifying a content type, we should default to application/octet-stream, which is a more accurate way of saying "I don't know what this file is" than application/xml.
An incorrect content type can confuse libraries that try to act on a file automatically based on its content type. See https://github.com/apache/arrow/issues/11934
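The defaulting rule proposed above could be sketched roughly as follows. This is only an illustration of the intended behavior, not the actual S3FileSystem code: the helper name `with_default_content_type` and the plain-dict representation of the metadata are assumptions made for the example.

```python
# Hypothetical helper, not part of Arrow or the AWS SDK.
DEFAULT_CONTENT_TYPE = "application/octet-stream"


def with_default_content_type(metadata=None):
    """Return a copy of `metadata` that always carries a content type.

    A content type supplied by the caller is kept untouched; the
    generic binary type is added only when none is present.
    """
    result = dict(metadata or {})
    # HTTP header names are case-insensitive, so accept any spelling.
    has_content_type = any(key.lower() == "content-type" for key in result)
    if not has_content_type:
        result["Content-Type"] = DEFAULT_CONTENT_TYPE
    return result
```

With this rule an explicit content type always wins, and an upload without one is sent as application/octet-stream rather than falling through to the SDK's application/xml.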