[ARROW-6652] [Python] to_pandas conversion removes timezone from type - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.15.0
Component/s: Python
Labels:
- pull-request-available

External issue URL:
https://github.com/apache/arrow/issues/23003

Description

Calling to_pandas on a pyarrow.Array with a timezone-aware timestamp type drops the timezone from the resulting pandas.Series:

>>> import pyarrow as pa
>>> a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
>>> a.to_pandas()
0   1970-01-01 00:00:00.000001
dtype: datetime64[ns]

In 0.14.1, converting a pyarrow.Column with to_pandas retained the timezone:

In [4]: import pyarrow as pa
   ...: a = pa.array([1], type=pa.timestamp('us', tz='America/Los_Angeles'))
   ...: c = pa.Column.from_array('ts', a)

In [5]: c.to_pandas()
Out[5]: 
0   1969-12-31 16:00:00.000001-08:00
Name: ts, dtype: datetime64[ns, America/Los_Angeles]

Issue Links

relates to: ARROW-6429 [CI][Crossbow] Nightly spark integration job fails (Resolved)
links to: GitHub Pull Request #5462, GitHub Pull Request #5471

People

Assignee: Joris Van den Bossche
Reporter: Bryan Cutler
Votes: 0
Watchers: 5

Dates

Created: 21/Sep/19 16:58
Updated: 11/Jan/23 07:48
Resolved: 22/Sep/19 00:28

Time Tracking

Estimated: Not Specified
Remaining:
Logged: 2h 40m