[ARROW-16746] [C++][Python] S3 tag support on write - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: C++, Python
Labels:
- good-second-issue

External issue URL:
https://github.com/apache/arrow/issues/32083

Description

S3 allows objects to be tagged to better organize one's data (https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html). We use this for efficient downstream processing and inventory management.

Currently arrow/pyarrow does not allow tags to be added on write. This forces us to scan the bucket and re-apply the tags after a pyarrow-based process has run.

I looked through the code and think that it could potentially be done via the metadata mechanism.

The tags need to be added to the CreateMultipartUploadRequest here: https://github.com/apache/arrow/blob/master/cpp/src/arrow/filesystem/s3fs.cc#L1156
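To illustrate the metadata idea: as far as I can tell, the AWS SDK request takes the tag set as a single URL-query-encoded string (the same form as the `x-amz-tagging` header), so the tags would need to be flattened from key/value pairs first. A minimal sketch of that step, assuming the tags arrive as a plain dict. `encode_s3_tagging` is a hypothetical helper name, not an existing arrow/pyarrow API, and the limits are the ones given in the S3 object tagging guide.

```python
from urllib.parse import urlencode

# Limits documented in the S3 object tagging user guide.
MAX_TAGS_PER_OBJECT = 10
MAX_KEY_LENGTH = 128
MAX_VALUE_LENGTH = 256


def encode_s3_tagging(tags):
    """Hypothetical helper: flatten {"key": "value"} into a URL-query-encoded tag set."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(f"S3 allows at most {MAX_TAGS_PER_OBJECT} tags per object")
    for key, value in tags.items():
        if not key or len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"invalid tag key: {key!r}")
        if len(value) > MAX_VALUE_LENGTH:
            raise ValueError(f"tag value too long for key {key!r}")
    # urlencode percent-encodes reserved characters and joins pairs with '&'.
    return urlencode(tags)


print(encode_s3_tagging({"project": "alpha", "team": "data eng"}))
# prints: project=alpha&team=data+eng
```

The resulting string is what would be handed to the upload request. How the tags reach that point (dedicated option vs. a reserved metadata key) is the open design question.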

People

Assignee: Unassigned
Reporter: André Kelpe
Votes: 0
Watchers: 7

Dates

Created: 03/Jun/22 08:09
Updated: 11/Jan/23 11:46
