Apache Arrow / ARROW-6749

[Python] Conversion of non-ns timestamp array to numpy gives wrong values

    Details

      Description

      In [25]: np_arr = np.arange("2012-01-01", "2012-01-06", int(1e6)*60*60*24, dtype="datetime64[us]")

      In [26]: np_arr
      Out[26]:
      array(['2012-01-01T00:00:00.000000', '2012-01-02T00:00:00.000000',
             '2012-01-03T00:00:00.000000', '2012-01-04T00:00:00.000000',
             '2012-01-05T00:00:00.000000'], dtype='datetime64[us]')

      In [27]: arr = pa.array(np_arr)

      In [28]: arr
      Out[28]:
      <pyarrow.lib.TimestampArray object at 0x7f0b2ef07ee8>
      [
        2012-01-01 00:00:00.000000,
        2012-01-02 00:00:00.000000,
        2012-01-03 00:00:00.000000,
        2012-01-04 00:00:00.000000,
        2012-01-05 00:00:00.000000
      ]

      In [29]: arr.type
      Out[29]: TimestampType(timestamp[us])

      In [30]: arr.to_numpy()
      Out[30]:
      array(['1970-01-16T08:09:36.000000000', '1970-01-16T08:11:02.400000000',
             '1970-01-16T08:12:28.800000000', '1970-01-16T08:13:55.200000000',
             '1970-01-16T08:15:21.600000000'], dtype='datetime64[ns]')
      

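      So `to_numpy()` seems to simply reinterpret the raw integer microsecond values as nanoseconds, without rescaling them by 1000, and labels the result `datetime64[ns]`. That is why every value lands in January 1970 instead of January 2012.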
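      The suspected cause can be checked with NumPy alone. The sketch below is not the pyarrow code path. It only mimics the suspected behavior by viewing the microsecond integers as `datetime64[ns]`, and compares that with a proper unit conversion via `astype`. The names `raw`, `misread` and `correct` are illustrative only.

```python
import numpy as np

# Same five daily timestamps as in the report, at microsecond resolution.
np_arr = np.arange("2012-01-01", "2012-01-06", dtype="datetime64[D]").astype("datetime64[us]")

# Underlying int64 storage: microseconds since the Unix epoch.
raw = np_arr.view("int64")

# Suspected faulty behavior: the same integers relabelled as nanoseconds.
misread = raw.view("datetime64[ns]")

# Expected behavior: a real unit conversion, which multiplies by 1000.
correct = np_arr.astype("datetime64[ns]")

print(raw[0])      # 1325376000000000
print(misread[0])  # 1970-01-16T08:09:36.000000000
print(correct[0])  # 2012-01-01T00:00:00.000000000
```

      The relabelled values match the `Out[30]` output above, which supports the reading that the unit is dropped rather than converted.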

              People

              • Assignee: Joris Van den Bossche (jorisvandenbossche)
              • Reporter: Joris Van den Bossche (jorisvandenbossche)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 3h 10m