[ARROW-2882] [C++][Python] Support AWS Firehose partition_scheme implementation for Parquet datasets - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.17.0
Component/s: C++, Python
Labels:

External issue URL:
https://github.com/apache/arrow/issues/19253

Description

I'd like to be able to read a ParquetDataset generated by AWS Firehose.

The only partition scheme implemented at the time of writing is the one created by Hive (year=2018/month=01/day=11).

The AWS Firehose partition scheme is slightly different: its directories hold only the values, without key names (2018/01/11).
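To make the difference between the two layouts concrete, here is a minimal sketch in plain Python. It is not pyarrow code: the helper names `parse_hive_path` and `parse_directory_path`, the file name `part-0.parquet`, and the field names `year`, `month`, `day` are all assumptions made for illustration. The point is that a Hive path carries its own key names, while a Firehose-style path needs the field names supplied by the reader.

```python
from pathlib import PurePosixPath


def parse_hive_path(path):
    """Collect key=value directory segments (Hive layout) into a dict."""
    keys = {}
    for segment in PurePosixPath(path).parts:
        if "=" in segment:
            name, _, value = segment.partition("=")
            keys[name] = value
    return keys


def parse_directory_path(path, field_names):
    """Map the leading directory segments to field names by position.

    The path itself carries no key names, so the caller must supply them.
    """
    segments = PurePosixPath(path).parts
    if len(segments) < len(field_names):
        raise ValueError(f"expected at least {len(field_names)} segments in {path!r}")
    # zip stops after the last field name, so trailing segments
    # (such as the file name) are ignored.
    return dict(zip(field_names, segments))


# Hypothetical paths modelled on the two examples in this issue.
hive_keys = parse_hive_path("year=2018/month=01/day=11/part-0.parquet")
firehose_keys = parse_directory_path(
    "2018/01/11/part-0.parquet", ["year", "month", "day"]
)
```

Under these assumptions both calls produce the same partition keys, which is why supporting the second layout mostly comes down to letting the user declare the field names up front.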

Thanks

Issue Links

depends upon

ARROW-8039 [Python][Dataset] Support using dataset API in pyarrow.parquet with a minimal ParquetDataset shim

Resolved

People

Assignee: Unassigned

Reporter: Pablo Javier Takara

Votes: 0

Watchers: 5

Dates

Created: 19/Jul/18 13:48

Updated: 11/Jan/23 07:23

Resolved: 16/Jun/20 06:10