Data types are not preserved when a pandas data frame is partitioned and saved as a Parquet dataset using pyarrow. They are preserved when the same data frame is saved without partitioning.
Case 1: Partitioned dataset - Data types are NOT preserved
In this case, the data type of age is int64 in the original pandas data frame, but it changes to category after the data frame is saved locally as a partitioned dataset and loaded back.
Case 2: Non-partitioned dataset - Data types are preserved
- Python 3.7.3
- pyarrow 0.14.1