Details
- Type: New Feature
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Fix Version/s: 0.6.0
- Component/s: None
- Environment: Mac OS Sierra 10.12.6
Description
I'm fairly new to pyarrow, so I apologize if this is already a feature, but I couldn't find a solution in the documentation or an existing issue. Basically, I'm trying to export pandas DataFrames to .parquet files with partitions. I can see that pyarrow.parquet has a way of reading partitioned .parquet files, but there's no indication that it can write them with partitions. For example, it would be nice if pyarrow.parquet.write_table() took a list of columns to partition the table by, similar to the "partitionBy" parameter of spark.write.parquet in the PySpark implementation (a sketch of the requested behavior follows the referenced links below).
Referenced links:
https://arrow.apache.org/docs/python/parquet.html
https://arrow.apache.org/docs/python/parquet.html?highlight=pyarrow%20parquet%20partition
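A minimal sketch of the kind of partitioned write being requested, assuming a pyarrow.parquet.write_to_dataset() helper with a partition_cols parameter is available (this helper is not mentioned in the original report; the example directory layout is illustrative):

```python
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Example DataFrame with a column to partition on.
df = pd.DataFrame({
    "year": [2016, 2016, 2017, 2017],
    "value": [1.0, 2.0, 3.0, 4.0],
})

table = pa.Table.from_pandas(df)

# Write a partitioned dataset: one subdirectory per distinct "year" value,
# e.g. dataset/year=2016/... and dataset/year=2017/...
pq.write_to_dataset(table, root_path="dataset", partition_cols=["year"])
```

The resulting directory tree mirrors the Hive-style partitioning that pyarrow.parquet can already read back as a dataset.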