Introducing partitioning in write_dataset() creates sub-folders just fine, but the lowest-level subfolder only ever contains a single part-0.parquet. I don't see how to get write_dataset() to generate output with multiple part files in a single directory, like part-0.parquet, part-1.parquet, etc. For example, the documentation for open_dataset() implies we should get three `Z` level parts:
But I get the expected directory structure with only a part-0.parquet file in each lowest-level folder.
Context: I frequently need to partition large files that lack any natural grouping variable; I merely want a bunch of small parts of equal size. It would be great if there were an automatic way of doing this. Currently I can hack it by creating a partition column with integers 1...n, where n is my desired number of partitions, and partitioning on that column. I'd then like to write these to a flat structure with part-0.parquet, part-1.parquet, etc., not a nested folder structure, if possible.
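To make the workaround concrete, here is a minimal sketch of how the integer partition column could be computed. It is written in plain Python only to show the arithmetic; it is not arrow code, and the name `shard_ids` is hypothetical. It assumes the parts should be contiguous blocks of rows whose sizes differ by at most one.

```python
def shard_ids(n_rows, n_parts):
    """Return one 1-based part id per row, in contiguous near-equal blocks.

    Hypothetical helper for illustration: the result is the integer
    column (1...n) that one would attach to the data and partition on.
    """
    if n_rows < 0:
        raise ValueError("n_rows must be non-negative")
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    # Integer arithmetic maps row i to block floor(i * n_parts / n_rows),
    # which keeps block sizes within one row of each other.
    return [i * n_parts // n_rows + 1 for i in range(n_rows)]


# Ten rows split into three parts: the first part gets the extra row.
ids = shard_ids(10, 3)
print(ids)  # [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
```

If `n_parts` exceeds `n_rows`, some ids are simply never used, so fewer parts than requested would be written.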
(Or better yet, it would be amazing if write_dataset() just let us set a maximum partition file size and could automate the sharding into parts while preserving the existing behavior for actually semantically meaningful groups. Maybe that is already the intent but I cannot see how to activate it!)