Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: 0.9.0
Description
It seems pandas Timestamps are supported in some places but not others. Specifically, they work in primitive Arrays but not ListArrays:
import pandas as pd
import pyarrow
from datetime import datetime

ts = [pd.Timestamp(2017, 1, 1, 12), pd.Timestamp(2018, 1, 1, 12)]
dt = [datetime(2017, 1, 1, 12), datetime(2018, 1, 1, 12)]

pyarrow.Table.from_pandas(pd.DataFrame(dict(dates=dt)))        # OK :)
pyarrow.Table.from_pandas(pd.DataFrame(dict(dates=[dt, dt])))  # OK :)
pyarrow.Table.from_pandas(pd.DataFrame(dict(dates=ts)))        # OK :)
pyarrow.Table.from_pandas(pd.DataFrame(dict(dates=[ts, ts])))  # Fail :(
The above code results in:
ArrowInvalid: Error inferring Arrow data type for collection of Python objects. Got Python object of type Timestamp but can only handle these types: bool, float, integer, date, datetime, bytes, unicode, decimal
I guess this should be supported?
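A possible workaround until this is supported, sketched under assumptions: since lists of plain datetime objects convert fine in the reproduction above, the nested Timestamps could be turned into datetimes before building the DataFrame. The helper name to_plain_datetimes is hypothetical, not part of pandas or pyarrow. It relies only on the to_pydatetime() method that pandas.Timestamp provides, and I have not verified it against 0.9.0. Note that to_pydatetime() drops nanosecond precision.

```python
from datetime import datetime


def to_plain_datetimes(values):
    """Recursively replace objects exposing to_pydatetime() with plain datetimes.

    Hypothetical helper for illustration only. pandas.Timestamp has a
    to_pydatetime() method; plain datetime objects do not, so they pass
    through unchanged.
    """
    if isinstance(values, (list, tuple)):
        # Walk nested lists/tuples and rebuild them as lists.
        return [to_plain_datetimes(v) for v in values]
    if hasattr(values, "to_pydatetime"):
        # Timestamp-like object: convert to a standard library datetime.
        return values.to_pydatetime()
    return values


# Intended use with the reproduction above (requires pandas and pyarrow):
# pyarrow.Table.from_pandas(
#     pd.DataFrame(dict(dates=to_plain_datetimes([ts, ts]))))
```

This only sidesteps the type inference error on the caller's side; it does not change how pyarrow treats Timestamp objects inside lists.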