Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- None
Description
When casting e.g. timestamp('s') to timestamp('ns'), we do not check for out-of-bounds timestamps, which gives "garbage" timestamps in the result:
In [74]: a_np = np.array(["2012-01-01", "2412-01-01"], dtype="datetime64[s]")

In [75]: arr = pa.array(a_np)

In [76]: arr
Out[76]:
<pyarrow.lib.TimestampArray object at 0x7f3d1f07cb88>
[
  2012-01-01 00:00:00,
  2412-01-01 00:00:00
]

In [77]: arr.cast(pa.timestamp('ns'))
Out[77]:
<pyarrow.lib.TimestampArray object at 0x7f3d1f07cfa8>
[
  2012-01-01 00:00:00.000000000,
  1827-06-13 00:25:26.290448384
]
Now, this is the same behaviour as numpy, so I am not sure we should change it. However, since the cast has a safe=True/False option, I would expect safe=True to check for out-of-bounds values and safe=False to skip the check.
(numpy has a similar casting='safe' option, but it does not raise an error in that case either.)
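The "garbage" value is consistent with plain signed 64-bit overflow: 2412-01-01 expressed in nanoseconds since the epoch does not fit in an int64, so the multiplication by 10^9 wraps around. The following is a minimal standard-library sketch of that arithmetic and of the check proposed for safe=True. The helper names (wrap_int64, seconds_to_nanos) are hypothetical and are not pyarrow's API or implementation.

```python
from datetime import datetime, timedelta

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)


def wrap_int64(value):
    """Return the value a signed 64-bit integer would hold after overflow."""
    return (value - INT64_MIN) % 2**64 + INT64_MIN


def seconds_to_nanos(seconds, safe=True):
    """Hypothetical helper: convert epoch seconds to epoch nanoseconds.

    With safe=True an out-of-range result raises; with safe=False it
    silently wraps, which is the behaviour described in this report.
    """
    nanos = seconds * 1_000_000_000
    if INT64_MIN <= nanos <= INT64_MAX:
        return nanos
    if safe:
        raise OverflowError(f"{seconds} s does not fit in int64 nanoseconds")
    return wrap_int64(nanos)


# 2412-01-01 as whole seconds since the epoch.
seconds = (datetime(2412, 1, 1) - EPOCH) // timedelta(seconds=1)

# Unchecked conversion: the product wraps to a negative nanosecond count.
wrapped = seconds_to_nanos(seconds, safe=False)

# Split into whole seconds and a nanosecond remainder, because
# datetime only has microsecond resolution.
whole_seconds, nanos_part = divmod(wrapped, 1_000_000_000)
print(EPOCH + timedelta(seconds=whole_seconds), nanos_part)
# 1827-06-13 00:25:26 290448384

# Checked conversion: the same input is rejected.
try:
    seconds_to_nanos(seconds, safe=True)
except OverflowError as exc:
    print("rejected:", exc)
```

The wrapped result matches the 1827-06-13 00:25:26.290448384 value in the output above, which suggests that a range check before the multiplication would be enough to detect this case.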