  1. Apache Arrow
  2. ARROW-15306

[C++] S3FileSystem Should set the content-type header to application/octet-stream if not specified

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0.0
    • Component/s: C++

    Description

      By default, the S3FileSystem does not set the content-type header. That is technically correct, as I don't think a content-type should be specified if one isn't known.

      However, the AWS S3 SDK (aws-sdk-cpp) appears to set the content-type to application/xml whenever it is not specified: https://github.com/aws/aws-sdk-cpp/blob/5378016f845fe85e334ffc30319614e7d4dad41f/aws-cpp-sdk-s3/include/aws/s3/S3Request.h#L41

      We could potentially file an issue with the S3 SDK, but I'm not sure that would make any progress (and S3 itself may require that a content-type always be present for some reason).

      Since there is no way to avoid specifying a content-type, we should default to application/octet-stream, which is a more accurate "I don't know what this file is" than application/xml.

      An incorrect content-type can confuse libraries that try to act on the file automatically based on that header. See https://github.com/apache/arrow/issues/11934
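
      The proposed default can be sketched as follows. This is a minimal illustration, not Arrow's actual implementation: the `Metadata` alias, `ResolveContentType`, `ToLower`, and `kDefaultContentType` are hypothetical names, and the metadata is modeled as a plain string map rather than Arrow's key-value metadata type.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for user-supplied object metadata (an assumption,
// not an Arrow or AWS SDK type).
using Metadata = std::map<std::string, std::string>;

// Generic "unknown binary data" type used when the caller gives no hint.
const char kDefaultContentType[] = "application/octet-stream";

// Lowercase a copy of the input so header names compare case-insensitively.
std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Return the caller-supplied Content-Type if one is present and non-empty;
// otherwise fall back to application/octet-stream.
std::string ResolveContentType(const Metadata& metadata) {
  for (const auto& kv : metadata) {
    if (ToLower(kv.first) == "content-type" && !kv.second.empty()) {
      return kv.second;
    }
  }
  return kDefaultContentType;
}
```

      In a real upload path the resolved value would then be set explicitly on the outgoing request (for example via `SetContentType` on the AWS SDK's `PutObjectRequest`), so the SDK's own application/xml fallback never applies.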

            People

              Assignee: Weston Pace (westonpace)
              Reporter: Weston Pace (westonpace)
              Votes: 0
              Watchers: 4

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m