[ARROW-6749] [Python] Conversion of non-ns timestamp array to numpy gives wrong values - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.16.0
Component/s: Python
Labels:
- pull-request-available

External issue URL:
https://github.com/apache/arrow/issues/23088

Description

In [25]: np_arr = np.arange("2012-01-01", "2012-01-06", int(1e6)*60*60*24, dtype="datetime64[us]")                                                                                                                 

In [26]: np_arr                                                                                                                                                                                                    
Out[26]: 
array(['2012-01-01T00:00:00.000000', '2012-01-02T00:00:00.000000',
       '2012-01-03T00:00:00.000000', '2012-01-04T00:00:00.000000',
       '2012-01-05T00:00:00.000000'], dtype='datetime64[us]')

In [27]: arr = pa.array(np_arr)                                                                                                                                                                                    

In [28]: arr                                                                                                                                                                                                       
Out[28]: 
<pyarrow.lib.TimestampArray object at 0x7f0b2ef07ee8>
[
  2012-01-01 00:00:00.000000,
  2012-01-02 00:00:00.000000,
  2012-01-03 00:00:00.000000,
  2012-01-04 00:00:00.000000,
  2012-01-05 00:00:00.000000
]

In [29]: arr.type                                                                                                                                                                                                  
Out[29]: TimestampType(timestamp[us])

In [30]: arr.to_numpy()                                                                                                                                                                                            
Out[30]: 
array(['1970-01-16T08:09:36.000000000', '1970-01-16T08:11:02.400000000',
       '1970-01-16T08:12:28.800000000', '1970-01-16T08:13:55.200000000',
       '1970-01-16T08:15:21.600000000'], dtype='datetime64[ns]')

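So to_numpy() appears to reinterpret the stored integer microsecond values as nanoseconds instead of rescaling them by 1000: the result has dtype datetime64[ns], but the values fall in January 1970 rather than January 2012.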
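The arithmetic behind that diagnosis can be checked with NumPy alone. The following is a minimal sketch, not the actual pyarrow code path: it assumes the bug is equivalent to relabelling the raw int64 microsecond counts as nanoseconds, and the variable names (raw_us, misread, correct) are illustrative only.

```python
import numpy as np

# The same five dates as in the report, at microsecond resolution.
np_arr = np.array(
    ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04", "2012-01-05"],
    dtype="datetime64[us]",
)

# Raw int64 counts of microseconds since the Unix epoch.
raw_us = np_arr.astype("int64")

# Assumed failure mode: the same integers relabelled as nanoseconds,
# with no rescaling. This shrinks every value by a factor of 1000.
misread = raw_us.view("datetime64[ns]")

# Expected behaviour: a unit-aware cast that multiplies by 1000.
correct = np_arr.astype("datetime64[ns]")

print(misread[0])   # lands in January 1970, as in Out[30]
print(correct[0])   # stays on 2012-01-01
```

For the first element, 1325376000000000 us read as ns is 1325376 s, which is 15 days, 8 hours, 9 minutes and 36 seconds after the epoch. That matches the first value in Out[30].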

Issue Links

supersedes

ARROW-2853 [Python] Implementing support for zero copy NumPy arrays in libarrow_python

Closed

links to

GitHub Pull Request #5718

People

Assignee: Joris Van den Bossche

Reporter: Joris Van den Bossche

Votes: 0

Watchers: 4

Dates

Created: 01/Oct/19 10:07

Updated: 11/Jan/23 07:49

Resolved: 14/Nov/19 17:19

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 3h 10m