Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Description
The pyarrow Parquet writer changes uint32 columns to int64: a uint32 column written with pq.write_table is read back as int64. Other unsigned integer types are not affected; uint8, uint16, and uint64 columns retain their type.
In [1]: import pandas as pd
In [2]: import pyarrow as pa
In [3]: import pyarrow.parquet as pq
In [5]: df = pd.DataFrame({'a':pd.Series([1,2,3], dtype='uint32')})
In [6]: padf = pa.Table.from_pandas(df)
In [7]: padf
Out[7]:
pyarrow.Table
a: uint32
In [8]: pq.write_table(padf, 'pa.parquet')
In [9]: pq.read_table('pa.parquet')
Out[9]:
pyarrow.Table
a: int64