Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Writing a table with a duration column to Parquet is currently not supported:
In [37]: table = pa.table({'a': pa.array([1, 2], pa.duration('s'))})

In [39]: table
Out[39]:
pyarrow.Table
a: duration[s]

In [41]: pq.write_table(table, 'test_duration.parquet')
...
ArrowNotImplementedError: Unhandled type for Arrow to Parquet schema conversion: duration[s]
There is no direct mapping from Arrow's duration type to a Parquet logical type. Parquet does have an INTERVAL type, but that corresponds more closely to Arrow's interval type (YEAR_MONTH or DAY_TIME).
However, the duration values could be stored as plain integers, and the duration type could then be restored from the serialized Arrow schema when the file is read back in.