Details
- New Feature
- Status: Resolved
- Minor
- Resolution: Fixed
- 3.0.0
Description
The Python docs show that write_dataset accepts a max_partitions argument, so we can pass, say, 1025 partitions:
https://arrow.apache.org/docs/_modules/pyarrow/dataset.html
In R this argument doesn't exist; it would be good to add it for arrow v4.0.0.
This is useful, for example, with international trade datasets:
# d = UN COMTRADE - World's bilateral flows 2019
# 13,050,535 x 22 data.frame
d %>%
  group_by(Year, `Reporter ISO`, `Partner ISO`) %>%
  write_dataset("parquet", hive_style = FALSE)
Error: Invalid: Fragment would be written into 12808 partitions. This exceeds the maximum of 1024
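The error above arises because one partition is written per distinct combination of the grouping columns. A minimal sketch of how this could be checked and worked around, assuming an arrow version whose write_dataset() accepts max_partitions (the argument this issue requests). The toy data frame and the output directory are hypothetical stand-ins for the COMTRADE data:

```r
# Toy stand-in for the COMTRADE data frame `d` (hypothetical values).
d <- data.frame(
  Year = c(2019L, 2019L, 2019L, 2019L),
  `Reporter ISO` = c("CHL", "CHL", "ARG", "ARG"),
  `Partner ISO` = c("ARG", "PER", "CHL", "CHL"),
  value = c(1, 2, 3, 4),
  check.names = FALSE
)

# Count the distinct combinations of the grouping columns;
# each combination becomes one partition directory.
keys <- c("Year", "Reporter ISO", "Partner ISO")
n_partitions <- nrow(unique(d[, keys]))

# Write only if arrow and dplyr are installed.
# Assumption: this arrow version supports `max_partitions`.
if (requireNamespace("arrow", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  out_dir <- file.path(tempdir(), "parquet")  # hypothetical output path
  grouped <- dplyr::group_by(d, Year, `Reporter ISO`, `Partner ISO`)
  arrow::write_dataset(
    grouped, out_dir,
    hive_style = FALSE,
    # Raise the limit only when the data needs more than the default 1024.
    max_partitions = max(1024L, n_partitions)
  )
}
```

For the dataset in this report, n_partitions would be 12808, so the limit would need to be raised to at least that value.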
Issue Links
- is duplicated by: ARROW-12363 Write tests for max_partitions argument (Closed)