Currently pyarrow's parquet writer only writes `_common_metadata` and not `_metadata`. From what I understand these are intended to contain the dataset schema but not any row group information.
A few (possibly naive) questions:
1. In the `__init__` for `ParquetDataset`, the following lines exist:
I believe this should use `common_metadata_path` instead of `metadata_path`, as the latter points to the `_metadata` file, which `pyarrow` never writes, rather than to `_common_metadata` (as seemingly intended?).
2. In `validate_schemas` I believe an option should exist for using the schema from `_common_metadata` instead of `_metadata`, as pyarrow currently only writes the former, and as far as I can tell `_common_metadata` does include all the schema information needed.
Perhaps the logic in `validate_schemas` could be ported over to:
If these changes are valid, I'd be happy to submit a PR. The difference between `_common_metadata` and `_metadata` is not 100% clear to me, but I believe the schema in both should be the same. Figured I'd open this for discussion.
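
To make the proposal concrete, here is a minimal sketch of the fallback I have in mind: prefer `_common_metadata` as the schema source and fall back to `_metadata` only if the former is missing. The helper name `pick_schema_file` and its `prefer_common` parameter are hypothetical and are not part of pyarrow; the sketch only decides which file to read and does not touch `ParquetDataset` itself.

```python
import os


def pick_schema_file(dataset_dir, prefer_common=True):
    """Return the sidecar file to read the dataset schema from, or None.

    Hypothetical helper, not part of pyarrow. It only checks which of the
    two conventional sidecar files exists in `dataset_dir`.
    """
    common = os.path.join(dataset_dir, "_common_metadata")
    full = os.path.join(dataset_dir, "_metadata")
    # Order the candidates by preference and return the first one present.
    candidates = [common, full] if prefer_common else [full, common]
    for path in candidates:
        if os.path.exists(path):
            return path
    return None
```

The returned path could then be handed to `pyarrow.parquet.read_schema` to obtain the schema used for validation, assuming the file was written as a valid Parquet metadata file.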